mersenneforum.org


EdH 2019-11-09 15:50

How I Create a Colab Session That Factors factordb Composites with YAFU
 
(Note: I expect to keep the first post of each of these "How I..." threads up-to-date with the latest version. Please read the rest of each thread to see what may have led to the current set of instructions.)

I will take the liberty of expecting readers to already be somewhat familiar with Google's Colaboratory sessions. There are several threads already on Colab and these should be reviewed by interested readers:

[URL="https://mersenneforum.org/showthread.php?t=24646"]Google Colaboratory Notebook?[/URL]
[URL="https://www.mersenneforum.org/showthread.php?t=24818"]GPU72 Notebook Integration...[/URL]
[URL="https://mersenneforum.org/showthread.php?p=527912"]Notebook Instance Reverse SSH and HTTP Tunnels.[/URL]
[URL="https://www.mersenneforum.org/showthread.php?t=24875"]Colab question[/URL]

I do not yet have a GitHub account, so I have not uploaded this to GitHub. Others may feel free to do so, if desired.

The following is one way to compile and install a minimally working build of YAFU. For this setup, a repository version of GMP is installed, and the current versions of GMP-ECM and YAFU are retrieved and compiled. This is not a fully working version of YAFU, in that it does not include any support for NFS. Since the composites retrieved from factordb are well under 95 digits in length, SIQS is used for any composite not factored by ECM.

When run, this session retrieves composites of a chosen size from factordb, factors them and submits the factors back to the db.

To use Colab, you need a Gmail account and will be required to log into that account to run a session.

On to the specifics:

Open a [URL="https://colab.research.google.com/notebooks/welcome.ipynb"]Google Colaboratory[/URL] session.
Sign in with your Google/Gmail account info.
Choose New Python3 notebook:
[code]
Menu->File->New Python3 notebook (or within popup)
[/code]
Click Connect to start a session.
Edit title from Untitled... to whatever you like.
Paste the following into the Codeblock:
[code]
#########################################################
### This Colaboratory session is designed to retrieve ###
### composites from factordb.com and factor them with ###
### YAFU. The factors are then sent to factordb.      ###
###                                                   ###
### To adjust the number of composites to retrieve as ###
### well as the size to retrieve, change the          ###
### variables below this comment block. The size of   ###
### the random number to be used to help avoid        ###
### collisions (1000) can also be changed, as well as ###
### the offset.                                       ###
#########################################################

compNum = 3 # Number of composites to run
compSize = 70 # Size of composites to run
ranNum = 1000 # Number for random count
offset = 10

import fileinput
import os
import random
import subprocess
import time
import urllib.request

#reports factors to factordb
def send2db(composite, factors):
    factorline = str(factors)
    sendline = 'report=' + str(composite) + '%3D' + factorline
    dbcall = sendline.encode('utf-8')
    temp2 = urllib.request.urlopen('http://factordb.com/report.php', dbcall)

#checks to see if yafu already exists
#if it does, this portion is skipped
exists = os.path.isfile('yafu')
if exists < 1:
    print("Installing system packages. . .")
    subprocess.call(["chmod", "777", "/tmp"])
    subprocess.call(["apt", "update"])
    subprocess.call(["apt", "install", "g++", "m4", "make", "subversion", "libgmp-dev", "libtool", "p7zip", "autoconf"])
    #retrieves ecm
    print("Retrieving GMP-ECM. . .")
    subprocess.call(["svn", "co", "svn://scm.gforge.inria.fr/svn/ecm/trunk", "ecm"])
    os.chdir("/content/ecm")
    subprocess.call(["libtoolize"])
    subprocess.call(["autoreconf", "-i"])
    subprocess.call(["./configure", "--with-gmp=/usr/local/"])
    print("Compiling GMP-ECM. . .")
    subprocess.call(["make"])
    subprocess.call(["make", "install"])
    print("Finished installing GMP-ECM. . .")
    os.chdir("/content")
    #retrieves YAFU
    print("Retrieving YAFU. . .")
    subprocess.call(["svn", "co", "https://svn.code.sf.net/p/yafu/code/branches/wip", "/content/yafu"])
    os.chdir("/content/yafu")
    for line in fileinput.input('Makefile', inplace=True):
        print(line.rstrip().replace('CC = gcc-7.3.0', 'CC = gcc'))
    for line in fileinput.input('yafu.ini', inplace=True):
        print(line.rstrip().replace('% threads=1', 'threads=2'))
    for line in fileinput.input('yafu.ini', inplace=True):
        print(line.rstrip().replace('ecm_path=../gmp-ecm/bin/ecm', 'ecm_path=/usr/local/bin/ecm'))
    print("Compiling YAFU. . .")
    subprocess.call(["make", "USE_SSE41=1"])
    print("Finished compiling YAFU. . .")
print("Starting the factoring of", compNum, "composites. . .\n")

#main loop
for x in range(compNum):
    randnum = random.randrange(ranNum) + offset
    #fetch a number from factordb
    dbcall = 'http://factordb.com/listtype.php?t=3&mindig=' + str(compSize) + '&perpage=1&start=' + str(randnum) + '&download=1'
    #some file processing to get the number into a format usable by yafu
    temp0 = urllib.request.urlopen(dbcall)
    temp1 = temp0.read()
    composite = temp1.decode(encoding='UTF-8')
    composite = composite.strip("\n")
    fstart = time.time()
    #print number being worked on
    # print("Composite", x + 1,":", composite, "<",len(composite),">")
    print("Composite {0}: {1} <{2}>".format( x + 1, composite,len(composite)))
    #run yafu
    factorT = subprocess.run(['./yafu', '-silent'], stdout=subprocess.PIPE, input=temp1)
    #find factors from the yafu run in factor.log
    file = open('factor.log', 'r')
    string = (", prp")
    fcheck = 0
    factors = ""
    for line in file:
        found = line.rfind(string)
        if found > 0:
            line = line.rstrip("\n")
            ind = line.rfind(" = ")
            ind += 3
            line = line[ind:]
            if fcheck > 0:
                factors = factors + "*"
            line = line.split(" ", 1)[0]
            factors = factors + line
            fcheck += 1
    os.remove("factor.log")
    runtime = time.time() - fstart
    #print factors found
    # print("Factors:", factors)
    # print("Factors (%d:%02d):" %(int(runtime / 60), int(runtime % 60)), factors, "\n")
    print("Factors ({0:0>1}:{1:0>2}): {2}\n".format(int(runtime / 60), int(runtime % 60), factors))
    # print("Elapsed time:", int(runtime / 60), "minutes and", int(runtime % 60), "seconds.\n")
    # print("Elapsed time:", runtime("%H:%M:%S"))
    #send number and factors to factordb
    send2db(composite, factors)
#all numbers are completed
print("Completed all", compNum, "composites!")
[/code]
Click on the Run cell icon or use CTRL-Enter.

The compilations will run for about two and a half minutes. When YAFU finishes compiling, after a couple of message blocks, the factoring process will begin if all went well.

The current default is to factor three 70-digit composites and stop (compNum = 3, compSize = 70). The factors are sent to the db automatically, so no manual intervention is needed. To change the number of composites to work on for each run, edit the compNum variable. To change the size of the composites to work on, edit the compSize variable.
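
For anyone who wants to see how those variables feed the two factordb requests, here is a minimal sketch that only builds the request URL and the report body, without contacting the server. The helper names (build_list_url, build_report_body, pick_start) are made up for illustration; the endpoint and query parameters are copied from the script above.

```python
import random
import urllib.parse

LIST_URL = 'http://factordb.com/listtype.php'


def build_list_url(comp_size, start):
    """Build the download URL for one composite, as the script does."""
    query = urllib.parse.urlencode({
        't': 3,                # list type used by the script
        'mindig': comp_size,   # taken from compSize
        'perpage': 1,          # one composite per request
        'start': start,        # random position in the list
        'download': 1,         # plain-text download
    })
    return LIST_URL + '?' + query


def build_report_body(composite, factors):
    """Build the POST body 'report=<composite>%3D<factors>' as bytes."""
    line = str(composite) + '=' + str(factors)
    # quote() turns '=' into '%3D'; '*' is kept as-is, like in the script
    return ('report=' + urllib.parse.quote(line, safe='*')).encode('utf-8')


def pick_start(ran_num, offset):
    """Random list position in the range [offset, offset + ran_num)."""
    return random.randrange(ran_num) + offset


print(build_list_url(70, pick_start(1000, 10)))
print(build_report_body(35, '5*7'))
```

The random start position is what helps two sessions avoid picking the same composite; a larger ranNum spreads the picks over more of the list.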

Eventually, I hope to add a more detailed description of all the code.
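
In the meantime, the factor.log step can be tried on its own. The sketch below repeats the same string handling as the main loop in a hypothetical helper, factors_from_log. The sample lines are made up in the shape the script assumes (a ", prp" tag followed by " = " and the factor); real factor.log entries carry more detail.

```python
def factors_from_log(lines):
    """Join the factors found on lines containing ', prp' with '*'."""
    found = []
    for line in lines:
        if line.rfind(', prp') > 0:
            value = line.rstrip('\n')
            # keep what follows the last ' = ', then drop trailing text
            value = value[value.rfind(' = ') + 3:]
            found.append(value.split(' ', 1)[0])
    return '*'.join(found)


# Made-up sample lines, for illustration only.
sample = [
    '11/09/19 15:50:01, starting SIQS on c6: 104927\n',
    '11/09/19 15:50:02, prp3 = 317\n',
    '11/09/19 15:50:02, prp3 = 331 (extra text)\n',
]
print(factors_from_log(sample))
```

The joined string is what gets passed to send2db as the factors argument.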

mathwiz 2019-11-09 16:46

Now to automate it...
 
Awesome guide!

Now we just have to figure out how to wire this up to FactorDB.com so composites are factored automatically :smile:

Dylan14 2019-11-09 17:07

I can confirm that the code works. A few things:


1. It suffices to just comment out the lines above the imports after you made the code once.
2. Is there a reason why you use the build option USE_SSE41=1, instead of something that is faster like AVX2? As it appears all of the Colab entities have at least this.
3. I added some more comments to the code below the compilation:


[CODE]
import random
import subprocess
import urllib.request

compNum = 2  # Number of composites to run
compSize = 80  # Size of composites to run
ranNum = 1000  # Number for random count

def send2db(composite, lastfactor):
    #reports factors to factordb
    factorline = str(lastfactor)
    sendline = 'report=' + str(composite) + '%3D' + factorline
    dbcall = sendline.encode('utf-8')
    temp2 = urllib.request.urlopen('http://factordb.com/report.php', dbcall)

#main loop
for x in range(compNum):  #run compnum composites
    randnum = random.randrange(ranNum)  #pick a random number
    #fetch number from factordb
    dbcall = 'http://factordb.com/listtype.php?t=3&mindig=' + str(compSize) + '&perpage=1&start=' + str(randnum) + '&download=1'
    #some file processing to get the number into a format usable by yafu
    temp0 = urllib.request.urlopen(dbcall)
    temp1 = temp0.read()
    composite = temp1.decode(encoding='UTF-8')
    composite = composite.strip("\n")
    #print composite to test
    print("The composite is", composite)
    #run yafu
    factorT = subprocess.run(['./yafu'], stdout=subprocess.PIPE, input=temp1)
    #find factors from a yafu run
    factor = factorT.stdout.decode('utf-8')
    factorloc = factor.index('***factors found***')
    factorloc += 22
    tail = factor[factorloc:]
    factors = tail[:-34]
    facind = factors.rfind('=')
    facind += 2
    lastfactor = factors[facind:]
    #print last factor found
    print("The last factor is", lastfactor)
    #send factors to fdb
    send2db(composite, lastfactor)
#run complete
print("Completed requested number of composites!")
[/CODE]

EdH 2019-11-09 18:11

[QUOTE=mathwiz;530116]Awesome guide!

Now we just have to figure out how to wire this up to FactorDB.com so composites are factored automatically :smile:[/QUOTE]Thanks, but I must not understand your comment.

Once started, the composites are retrieved, factored and uploaded to factordb.com automatically. The only manual part is the session start and choosing how many composites to work. Then, all is fully automated.

EdH 2019-11-09 18:26

[QUOTE=Dylan14;530119]I can confirm that the code works. A few things:


1. It suffices to just comment out the lines above the imports after you made the code once.
2. Is there a reason why you use the build option USE_SSE41=1, instead of something that is faster like AVX2? As it appears all of the Colab entities have at least this.
3. I added some more comments to the code below the compilation:

[/QUOTE]Thanks Dylan,

I tried a direct copy/paste and lost some formatting. I had to go back to my original. I'm being pulled away ATM, but plan to address all else later.

1. I considered a block delete easier than commenting out lines.
2. I have experienced segmentation faults with AVX2 in the past.
3. Thanks! I'll work on those later.
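
On point 2, a quick way to see which instruction sets a given session advertises is to read the CPU flags. This is only a sketch, assuming a Linux host with /proc/cpuinfo; the helper names are hypothetical. It shows whether AVX2 is present, not whether an AVX2 build of YAFU will behave.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith('flags') and ':' in line:
            return set(line.split(':', 1)[1].split())
    return set()


def read_cpuinfo(path='/proc/cpuinfo'):
    """Read cpuinfo, or return '' where the file is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ''


flags = cpu_flags(read_cpuinfo())
print('sse4_1:', 'sse4_1' in flags)
print('avx2:', 'avx2' in flags)
```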

EdH 2019-11-09 23:30

I made some changes, but unfortunately, the AVX2 option causes SIQS to return earlier than completion and the zero value for factorloc crashes the run. I'll work on this more later.
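
One possible guard for this kind of failure, as a sketch only: look for the marker before slicing, and skip the composite when it is missing. The marker string is the one from Dylan14's code; the helper name factor_block and the sample output text are made up for illustration.

```python
MARKER = '***factors found***'


def factor_block(stdout_text):
    """Return the text after the marker, or None if it is missing."""
    pos = stdout_text.find(MARKER)
    if pos < 0:
        return None  # YAFU returned without reporting factors
    return stdout_text[pos + len(MARKER):].strip()


# Made-up output text, for illustration only.
print(factor_block('...\n***factors found***\n\nP3 = 317\nP3 = 331\n'))
print(factor_block('SIQS stopped early\n'))
```

In the main loop, a None result could simply trigger a continue so that nothing is reported to the db for that composite.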

LaurV 2019-11-10 03:52

[QUOTE=EdH;530122]Thanks, but I must not understand your comment.

Once started, the composites are retrieved, factored and uploaded to factordb.com automatically. The only manual part is the session start and choosing how many composites to work. Then, all is fully automated.[/QUOTE]
I think he meant more or less in a serious way, something along the lines that factordb itself could be "wired" to run such script on colab by itself too.

EdH 2019-11-10 04:01

[QUOTE=LaurV;530163]I think he meant more or less in a serious way, something along the lines that factordb itself could be "wired" to run such script on colab by itself too.[/QUOTE]AH! Thank you! I was correct that I must not have understood. Indeed, I did not. But now I do see how it was meant, with your assistance. I fear factordb would overrun Colab if such was the case, though. . .

LaurV 2019-11-10 04:18

[QUOTE=EdH;530164] I fear factordb would overrun Colab if such was the case, though. . .[/QUOTE]That for sure. One cannot compare the 20 or 50 real cores that Syd has with the 1 virtual core that colab gives you. But 101 miles per hour is better than 100 miles per hour (this I learned on this forum!)

EdH 2019-11-11 16:10

I made some major changes, all reflected in the original post.

All comments welcome. . .

bsquared 2019-11-11 17:20

Are colab sessions single threaded? If not it would be helpful to run multithreaded.
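
A quick check that can be run in a cell to see how many logical CPUs a session exposes (a sketch; the helper name is made up). The script in the first post already sets threads=2 in yafu.ini, so this only shows whether that matches what the session reports.

```python
import os


def available_threads():
    """Logical CPUs visible to this session (at least 1)."""
    # os.cpu_count() may return None when the count cannot be determined
    return os.cpu_count() or 1


print('Logical CPUs:', available_threads())
```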


All times are UTC.