mersenneforum.org  

Old 2019-11-09, 15:50   #1
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

3367₁₀ Posts
How I Create a Colab Session That Factors factordb Composites with YAFU

(Note: I expect to keep the first post of each of these "How I..." threads up-to-date with the latest version. Please read the rest of each thread to see what may have led to the current set of instructions.)

I will take the liberty of expecting readers to already be somewhat familiar with Google's Colaboratory sessions. There are several threads already on Colab and these should be reviewed by interested readers:

Google Colaboratory Notebook?
GPU72 Notebook Integration...
Notebook Instance Reverse SSH and HTTP Tunnels.
Colab question

I do not, as of yet, have a github account, so I have not created an upload of this to github. Others may feel free to do so, if desired.

The following is one way to compile and install a minimally working build of YAFU. For this build, a repository version of GMP is installed, the current version of GMP-ECM is retrieved and compiled, and YAFU is retrieved and compiled. This is not a fully working version of YAFU, in that it does not include any support for NFS. Since the composites retrieved from factordb are well under 95 digits in length, SIQS is used for any composite not factored by ECM.

When run, this session retrieves composites of a chosen size from factordb, factors them and submits the factors back to the db.
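
As a rough sketch of that round trip, the two factordb requests can be built as plain strings before anything is sent. The helper names `build_fetch_url` and `build_report_body` are hypothetical; the URL pattern and the `report=` format are copied from the script in this post, and nothing here contacts factordb.

```python
def build_fetch_url(comp_size, start):
    """Return the factordb list URL used to download one composite."""
    return ('http://factordb.com/listtype.php?t=3&mindig=' + str(comp_size)
            + '&perpage=1&start=' + str(start) + '&download=1')


def build_report_body(composite, factors):
    """Return the POST body for report.php; '%3D' is the URL-encoded '='."""
    return ('report=' + str(composite) + '%3D' + str(factors)).encode('utf-8')


# Illustrative values only: an 80-digit request and a toy factorization
print(build_fetch_url(80, 17))
print(build_report_body(15, "3*5"))
```

Keeping the string building separate from `urllib.request.urlopen` makes it easy to print and inspect a request before letting a session submit anything.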

To use Colab, you need a Gmail account and will be required to log into that account to run a session.

On to the specifics:

Open a Google Colaboratory session.
Sign in with your Google/Gmail account info.
Choose New Python3 notebook:
Code:
Menu->File->New Python3 notebook (or within popup)
Click Connect to start a session.
Edit title from Untitled... to whatever you like.
Paste the following into the Codeblock:
Code:
#########################################################
### This Colaboratory session is designed to retrieve ###
### composites from factordb.com and factor them with ###
### YAFU.  The factors are then sent to factordb.     ###
###                                                   ###
### To adjust the number of composites to retrieve as ###
### well as the size to retrieve, change the settings ###
### below this comment block.  The size of the random ###
### number to be used to help avoid collisions (1000) ###
### can also be changed, if desired.                  ###
#########################################################

compNum = 3 # Number of composites to run
compSize = 80 # Size of composites to run
ranNum = 1000 # Number for random count

import os
import random
import subprocess
import urllib.request

#reports factors to factordb
def send2db(composite, factors):
  factorline = str(factors)
  sendline = 'report=' + str(composite) + '%3D' + factorline
  dbcall = sendline.encode('utf-8')
  temp2 = urllib.request.urlopen('http://factordb.com/report.php', dbcall)

#checks to see if yafu already exists
#if it does, this portion is skipped
exists = os.path.isfile('yafu')
if not exists:
  print("Installing system packages. . .")
  subprocess.call(["chmod", "777", "/tmp"])
  subprocess.call(["apt", "update"])
  subprocess.call(["apt", "install", "g++", "m4", "make", "subversion", "libgmp-dev", "libtool", "p7zip", "autoconf"])
#retrieves ecm
  print("Retrieving GMP-ECM. . .")
  subprocess.call(["svn", "co", "svn://scm.gforge.inria.fr/svn/ecm/trunk", "ecm"])
  os.chdir("/content/ecm")
  subprocess.call(["libtoolize"])
  subprocess.call(["autoreconf", "-i"])
  subprocess.call(["./configure", "--with-gmp=/usr/local/"])
  print("Compiling GMP-ECM. . .")
  subprocess.call(["make"])
  subprocess.call(["make", "install"])
  print("Finished installing GMP-ECM. . .")
  os.chdir("/content")
#retrieves YAFU
  print("Retrieving YAFU. . .")
  subprocess.call(["svn", "co", "https://svn.code.sf.net/p/yafu/code/trunk", "/content/yafu"])
  os.chdir("/content/yafu")
  subprocess.call(["mv", "yafu.ini", "yafu.ini.orig"])
  with open("yafu.ini", "a+") as yafuini:
    yafuini.write("B1pm1=100000\n")
    yafuini.write("B1pp1=20000\n")
    yafuini.write("B1ecm=11000\n")
    yafuini.write("rhomax=200\n")
    yafuini.write("threads=2\n")
    yafuini.write("pretest_ratio=0.25\n")
    yafuini.write("ecm_path=/usr/local/bin/ecm\n")
    yafuini.write("xover=93\n")
    yafuini.write("no_clk_test=1")
  print("Compiling YAFU. . .")
  subprocess.call(["make", "x86_64", "USE_SSE41=1", "USE_AVX512=1"])
  print("Finished compiling YAFU. . .")
print("Starting the factoring of", compNum, "composites. . .\n")

#main loop
for x in range(compNum):
  randnum = random.randrange(ranNum)
#fetch a number from factordb
  dbcall = 'http://factordb.com/listtype.php?t=3&mindig=' + str(compSize) + '&perpage=1&start=' + str(randnum) + '&download=1'
#some file processing to get the number into a format usable by yafu
  temp0 = urllib.request.urlopen(dbcall)
  temp1 = temp0.read()
  composite = temp1.decode(encoding='UTF-8')
  composite = composite.strip("\n")
#print number being worked on
  print("Composite", x + 1, "is:", composite)
#run yafu
  factorT = subprocess.run(['./yafu', '-silent'], stdout=subprocess.PIPE, input=temp1)
#find factors from the yafu run in factor.log
  string = ", prp"
  fcheck = 0
  factors = ""
#the with block closes factor.log before it is removed
  with open('factor.log', 'r') as file:
    for line in file:
      found = line.rfind(string)
      if found > 0:
        line = line.rstrip("\n")
        ind = line.rfind(" = ")
        ind += 3
        line = line[ind:]
        if fcheck > 0:
          factors = factors + "*"
        line = line.split(" ", 1)[0]
        factors = factors + line
        fcheck += 1
  os.remove("factor.log")
#print factors found
  print("Factors for", x + 1, "are:", factors, "\n")
#send number and factors to factordb
  send2db(composite, factors)
#all numbers are completed
print("Completed all", compNum, "composites!")
Click on the Run cell icon or use CTRL-Enter.

The compilations will run for about two and a half minutes. When YAFU finishes compiling, after a couple of message blocks, the factoring process will begin if all went well.

The current default is to factor three 80-digit composites and stop. The factors are sent to the db automatically, so no manual intervention is needed. To change the number of composites to work on for each run, edit the compNum variable. To change the size of the composites to work on, edit the compSize variable.

Eventually, I hope to add a more detailed description of all the code.
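
Until then, here is a minimal sketch of the step that is easiest to misread: pulling the factors out of factor.log. The function name `factors_from_log` and the sample log lines are made up for illustration (a real YAFU log will differ in detail). The logic follows the loop in the script: keep lines containing ", prp", take what follows the last " = ", and cut at the first space.

```python
def factors_from_log(lines):
    """Join the prp factors found in factor.log-style lines with '*'."""
    found = []
    for line in lines:
        if ", prp" not in line:
            continue  # not a factor line
        value = line.rstrip("\n")
        value = value[value.rfind(" = ") + 3:]   # text after the last " = "
        found.append(value.split(" ", 1)[0])     # drop anything after the number
    return "*".join(found)


# Hypothetical log lines, shaped only to exercise the parser
sample = [
    "11/09/19 15:50:00, starting SIQS on c9: 100160063\n",
    "11/09/19 15:50:01, prp5 = 10007 (extra text)\n",
    "11/09/19 15:50:01, prp5 = 10009\n",
]
print(factors_from_log(sample))
```

The joined string is what the script passes to send2db as the right-hand side of the report.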

Last fiddled with by EdH on 2019-12-03 at 22:48
Old 2019-11-09, 16:46   #2
mathwiz

Mar 2019

2·7² Posts
Now to automate it...

Awesome guide!

Now we just have to figure out how to wire this up to FactorDB.com so composites are factored automatically
Old 2019-11-09, 17:07   #3
Dylan14

"Dylan"
Mar 2017

11·47 Posts

I can confirm that the code works. A few things:


1. It suffices to just comment out the lines above the imports after you made the code once.
2. Is there a reason why you use the build option USE_SSE41=1, instead of something that is faster like AVX2? As it appears all of the Colab entities have at least this.
3. I added some more comments to the code below the compilation:


Code:
import random
import subprocess
import urllib.request

compNum = 2 # Number of composites to run
compSize = 80 # Size of composites to run
ranNum = 1000 # Number for random count

def send2db(composite, lastfactor):
#reports factors to factordb
  factorline = str(lastfactor)
  sendline = 'report=' + str(composite) + '%3D' + factorline
  dbcall = sendline.encode('utf-8')
  temp2 = urllib.request.urlopen('http://factordb.com/report.php', dbcall)

#main loop
for x in range(compNum): #run compnum composites
  randnum = random.randrange(ranNum) #pick a random number
#fetch number from factordb
  dbcall = 'http://factordb.com/listtype.php?t=3&mindig=' + str(compSize) + '&perpage=1&start=' + str(randnum) + '&download=1'
#some file processing to get the number into a format usable by yafu
  temp0 = urllib.request.urlopen(dbcall)
  temp1 = temp0.read()
  composite = temp1.decode(encoding='UTF-8')
  composite = composite.strip("\n")
#print composite to test
  print("The composite is", composite)
#run yafu
  factorT = subprocess.run(['./yafu'], stdout=subprocess.PIPE, input=temp1)
#find factors from a yafu run
  factor = factorT.stdout.decode('utf-8')
  factorloc = factor.index('***factors found***')
  factorloc += 22
  tail = factor[factorloc:]
  factors = tail[:-34]
  facind = factors.rfind('=')
  facind += 2
  lastfactor = factors[facind:]
#print last factor found
  print("The last factor is", lastfactor)
#send factors to fdb
  send2db(composite, lastfactor)
#run complete
print("Completed requested number of composites!")
Old 2019-11-09, 18:11   #4
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

6447₈ Posts

Quote:
Originally Posted by mathwiz View Post
Awesome guide!

Now we just have to figure out how to wire this up to FactorDB.com so composites are factored automatically
Thanks, but I must not understand your comment.

Once started, the composites are retrieved, factored and uploaded to factordb.com automatically. The only manual part is the session start and choosing how many composites to work. Then, all is fully automated.
Old 2019-11-09, 18:26   #5
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

110100100111₂ Posts

Quote:
Originally Posted by Dylan14 View Post
I can confirm that the code works. A few things:


1. It suffices to just comment out the lines above the imports after you made the code once.
2. Is there a reason why you use the build option USE_SSE41=1, instead of something that is faster like AVX2? As it appears all of the Colab entities have at least this.
3. I added some more comments to the code below the compilation:
Thanks Dylan,

I tried a direct copy/paste and lost some formatting. I had to go back to my original. I'm being pulled away ATM, but plan to address all else later.

1. I considered a block delete easier than commenting out lines.
2. I have experienced segmentation faults with AVX2 in the past.
3. Thanks! I'll work on those later.
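
One way to see which of these instruction sets a given Colab machine advertises is to read the flags line of /proc/cpuinfo before choosing build options. This is only a sketch: `cpu_flags` is a hypothetical helper, the sample text is invented, and a CPU listing avx2 does not by itself explain or rule out the segmentation faults mentioned above.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


# Invented sample; on Linux, pass open('/proc/cpuinfo').read() instead
sample = "processor\t: 0\nflags\t\t: fpu sse4_1 avx avx2\n"
flags = cpu_flags(sample)
print("sse4_1" in flags, "avx2" in flags, "avx512f" in flags)
```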
Old 2019-11-09, 23:30   #6
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

6447₈ Posts

I made some changes, but unfortunately, the AVX2 option causes SIQS to return earlier than completion and the zero value for factorloc crashes the run. I'll work on this more later.
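
A possible guard for that failure, sketched under the assumption that the crash comes from the '***factors found***' marker being absent from YAFU's output: `str.find` returns -1 instead of raising, so the caller can skip the composite rather than stop the run. `text_after_marker` is a hypothetical helper, not part of the posted script, and the sample strings are invented.

```python
MARKER = '***factors found***'


def text_after_marker(output):
    """Return the text following MARKER, or None if it never appears."""
    pos = output.find(MARKER)
    if pos == -1:
        return None  # nothing to parse or report for this composite
    return output[pos + len(MARKER):].strip()


# Invented outputs: one with the marker, one without
print(text_after_marker("noise\n***factors found***\n\nP5 = 10007\n"))
print(text_after_marker("SIQS stopped early"))
```

In the main loop, a None result would be a natural place to `continue` to the next composite without calling send2db.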
Old 2019-11-10, 03:52   #7
LaurV
Romulan Interpreter

Jun 2011
Thailand

2×3×1,471 Posts

Quote:
Originally Posted by EdH View Post
Thanks, but I must not understand your comment.

Once started, the composites are retrieved, factored and uploaded to factordb.com automatically. The only manual part is the session start and choosing how many composites to work. Then, all is fully automated.
I think he meant more or less in a serious way, something along the lines that factordb itself could be "wired" to run such script on colab by itself too.
Old 2019-11-10, 04:01   #8
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

7×13×37 Posts

Quote:
Originally Posted by LaurV View Post
I think he meant more or less in a serious way, something along the lines that factordb itself could be "wired" to run such script on colab by itself too.
AH! Thank you! I was correct that I must not have understood. Indeed, I did not. But now I do see how it was meant, with your assistance. I fear factordb would overrun Colab if such was the case, though. . .
Old 2019-11-10, 04:18   #9
LaurV
Romulan Interpreter

Jun 2011
Thailand

10001001111010₂ Posts

Quote:
Originally Posted by EdH View Post
I fear factordb would overrun Colab if such was the case, though. . .
That's for sure. One cannot compare the 20 or 50 real cores that Syd has with the 1 virtual core that Colab gives you. But 101 miles per hour is better than 100 miles per hour (this I learned on this forum!)
Old 2019-11-11, 16:10   #10
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

3367₁₀ Posts

I made some major changes, all reflected in the original post.

All comments welcome. . .
Old 2019-11-11, 17:20   #11
bsquared

"Ben"
Feb 2007

3294₁₀ Posts

Are colab sessions single threaded? If not it would be helpful to run multithreaded.
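
A quick way to check from inside a session is `os.cpu_count()`, which reports the logical CPUs Python can see. The sketch below turns that into the `threads=` line that the first post writes to yafu.ini. `yafu_threads_line` is a hypothetical helper, and no particular core count for Colab is assumed here.

```python
import os


def yafu_threads_line(cpu_count=None):
    """Return a 'threads=N' line for yafu.ini based on the visible CPU count."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # cpu_count() may return None
    return "threads=" + str(max(1, cpu_count)) + "\n"


print("Logical CPUs seen by Python:", os.cpu_count())
print(yafu_threads_line(), end="")
```

Passing the count explicitly, for example `yafu_threads_line(2)`, reproduces the `threads=2` value hard-coded in the script.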
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.