mersenneforum.org  

2020-11-18, 03:47   #34
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns
DD5₁₆ Posts

Quote:
Originally Posted by chris2be8
If you have GP installed try something like:
Code:
echo 'factor($n,10^5)' | gp -f -q
That TFs a number up to 10^5 which is a much better way to split off small factors than running ECM.

Chris

I implemented this with 10^7 (because 10^5 and 10^6 failed on several) and it is flying at just over 1 sec/composite! Thanks!

However, I've done about 7000 25dd composites, and over the same period the count of 25dd composites rose from ~116k to ~163k. That seems to be going in the wrong direction. And my other machines are still offline.

2020-11-18, 08:29   #35
Happy5214
"Alexander"
Nov 2008
The Alamo City
1B9₁₆ Posts

Do you have a script (Linux) I can borrow? I might put a core or two on this in a day or so. Seems more helpful than the low-end aliquot sequences I'm working on now.

2020-11-18, 11:05   #36
penlu
Jul 2018
31₁₀ Posts

Here is some Python: https://github.com/penlu/factoring

The code is messy and variously suboptimal. Let me know if something is broken or if you need additional information to get it working.

To avoid repeatedly poking disk with msieve's save files, I run it as:
Code:
TMPDIR=/dev/shm python worker2.py
I have a machine working at 20dd, offset 40000. This script is using rather less than one core. The bottleneck for such small composites appears to be communication with factordb: it allows us four parallel requests per IP. The script above uses two requests per number: one to query its ID and one to submit its factorization. With the response time I get from factordb, this apparently works out to some 250 factored per minute. This is maybe about the throughput the script will manage per IP address it is run on.
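
As a rough cross-check of those figures, here is a minimal sketch of the arithmetic, assuming throughput is limited purely by request latency. The function name `throughput_per_minute` is illustrative, and the 480 ms latency is simply the value that reproduces the 250 per minute quoted above, not a measurement.

```shell
#!/bin/bash
# Estimated numbers factored per minute when request latency is the bottleneck.
# Arguments: parallel request slots, requests per number, latency per request (ms).
throughput_per_minute() {
    local slots=$1 requests_per_number=$2 latency_ms=$3
    # Each slot completes 60000/latency_ms requests per minute,
    # and every number costs requests_per_number requests.
    echo $(( slots * 60000 / (requests_per_number * latency_ms) ))
}

throughput_per_minute 4 2 480   # prints 250
```

Under this model, the only ways to raise throughput from one IP address are fewer requests per number or faster responses; extra cores would not help.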

2020-11-18, 14:49   #37
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns
3,541 Posts

Quote:
Originally Posted by Happy5214
Do you have a script (Linux) I can borrow? I might put a core or two on this in a day or so. Seems more helpful than the low-end aliquot sequences I'm working on now.

Here's a somewhat simple bash script:
Code:
#!/bin/bash
############################################
# Note that this script uses two temporary
# files: dbTemp and dbSuccess. It will alter
# and delete previous files with these names.
# This script is single threaded. If using
# multiple instances, you should keep track
# of the hourly limits for factordb.
############################################

digitsize=25
totalrun=5

printf "Factoring $totalrun composites:\n\n"
for ((n=0;n<totalrun;n++))
  do
    # Record the start time; SECONDS is bash's built-in elapsed-time counter.
    startt=$SECONDS
    # Pick a random offset (10..1009) into the list of composites.
    randomnumber=$((10 + RANDOM % 1000))
    wget "http://factordb.com/listtype.php?t=3&mindig=${digitsize}&perpage=1&start=$randomnumber&download=1" -q -O dbTemp
    # Read the composite from the first line of the downloaded file.
    read -r composite < dbTemp
    echo "Composite $((n+1)) of $totalrun is $composite <${#composite}>"
    # Trial factor up to 10^7 with gp.
    temp=$(echo "factor($composite, 10^7)" | gp -f -q)
    # Flatten gp's matrix output into *p1*p2*... (assumes every exponent is 1).
    temp=$(echo "$temp" | xargs)
    temp="${temp// 1\]/}"
    temp="${temp//\]/}"
    temp="${temp//\[/*}"
    temp="${temp// /}"
    printf "Factors are ${temp:1} <$((SECONDS-startt))s>\n\n"
    # Report as composite=p1*p2*... (%3D is a URL-encoded "=").
    returnfactors=${composite}%3D${temp:1}
    wget "http://factordb.com/report.php?report=$returnfactors" -q -O dbSuccess
  done
echo "Total time for $n composites was $SECONDS seconds."
rm dbTemp
Let me know of any programming flaws or troubles.
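
One case that may be worth checking: the bracket substitutions in the script drop the exponent column only when it is 1, so a repeated factor printed by gp as `[3 2]` would come out as `32`. Below is a minimal sketch of a parser that expands repeated primes instead, assuming gp prints one `[prime exponent]` row per factor (which the script already relies on). `gp_to_product` is a hypothetical helper name, not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: turn gp's factor() matrix output into p1*p2*...,
# writing a prime with exponent e as e repeated factors.
gp_to_product() {
    local cleaned tokens=() product="" i e
    # Replace the brackets with spaces, leaving: prime exponent prime exponent ...
    cleaned=${1//\[/ }
    cleaned=${cleaned//\]/ }
    # Split on any whitespace (including newlines) into an array.
    read -r -d '' -a tokens <<< "$cleaned" || true
    for ((i = 0; i + 1 < ${#tokens[@]}; i += 2)); do
        for ((e = 0; e < tokens[i+1]; e++)); do
            product+="*${tokens[i]}"
        done
    done
    # Drop the leading "*".
    printf '%s\n' "${product#\*}"
}

gp_to_product "[3 2] [1000003 1]"   # prints 3*3*1000003
```

In the loop this would be called as `gp_to_product "$(echo "factor($composite, 10^7)" | gp -f -q)"`, with the result used in place of `${temp:1}`.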

Last fiddled with by EdH on 2020-11-21 at 16:12 Reason: Bug eradication!

2020-11-18, 17:17   #38
chris2be8
Sep 2009
3674₈ Posts

Quote:
Originally Posted by EdH
Thanks! I'll definitely try to set that up on my RPi machines.

But, my "farm" machines have almost all crashed! They were working 70dd+10 and they're all coming up with gibberish:
Code:
href="res.php">limits</a>) already factored
Error, no result found
Factoring 3 digits: (<a
compCheck                                     100%  127KB   9.8MB/s   00:00    
(<a already factored
(<a already factored
Error, no result found
Factoring 28 digits: href="imp.html">Imprint</a>)
compCheck                                     100%  127KB   9.7MB/s   00:00    
href="imp.html">Imprint</a>) already factored
Error, no result found
Factoring 3 digits: (<a
compCheck                                     100%  127KB   9.5MB/s   00:00    
(<a already factored
(<a already factored
Error, no result found
Factoring 31 digits: href="datenschutz.html">Privacy
compCheck                                     100%  127KB   9.3MB/s   00:00    
href="datenschutz.html">Privacy already factored
Error, no result found
Factoring 11 digits: Policy</a>)
compCheck                                     100%  127KB   9.6MB/s   00:00    
Policy</a>) already factored
Error, no result found
Factoring 0 digits: 
compCheck                                     100%  127KB   9.6MB/s   00:00    
 already factored
Error, no result found
(The compCheck is a file that contains a list of all the composites that any of the machines has factored, to minimize duplicates. I have to trim it often.)

I had to take the entire set down for now.

To judge by the bits of HTML, factordb is sending a web page instead of a number to factor, which might well be saying you have reached a limit on how much you can do in an hour. The limits are per IP address connecting to factordb, so they are shared by everything at that IP address.

I've had to update my scripts to wait for a few minutes before trying again when they see something like that, and to dump whatever they received to a log where I can see it.
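
A minimal sketch of that kind of check, assuming a valid reply is nothing but decimal digits. `classify_response` is a hypothetical helper, and the file names and five-minute wait in the commented usage are illustrative rather than taken from any script in this thread.

```shell
#!/bin/bash
# Hypothetical helper: classify what factordb sent back.
# Prints "number" for a plain decimal integer and "retry" for anything else
# (HTML fragments, error text, or an empty reply).
classify_response() {
    if [[ $1 =~ ^[0-9]+$ ]]; then
        echo number
    else
        echo retry
    fi
}

# Illustrative usage inside a fetch loop:
#   read -r composite < dbTemp
#   if [[ $(classify_response "$composite") == retry ]]; then
#       cat dbTemp >> factordb.log   # keep the raw reply for inspection
#       sleep 300                    # wait a few minutes before trying again
#       continue
#   fi
```

Fragments such as `(<a` or an empty line, like those in the quoted log, would be classified as `retry` instead of being passed on as a composite.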

I've also seen messages like:
Code:
1040: SQLSTATE[HY000] [1040] Too many connections
That one is an internal problem in factordb.

Chris

2020-11-18, 20:19   #39
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns
3,541 Posts

Actually, I wasn't anywhere near the hourly limits, but I think what I hit was, "You have reached the maximum of 4 parallel processing requests (?). Please wait a few seconds and try again." I had so many machines that were all factoring so fast, they hit the limit of 4 requests at once. And, other machines kept accessing fine, but the affected ones wouldn't correct themselves once they started failing.

I'm working on a bash script, based on the one posted, that runs the gp function first, and then runs YAFU, if factors are not found. YAFU doesn't seem to be giving me the really small factors (in factor.log) for all these multiple prime composites.

I'm hoping to build in checking for a composite, because the Python script I was using couldn't seem to fix itself. For my own use, I also have a machine that tracks composites that have been started so they "shouldn't" be started by another machine before being finished by the first. That's where the "already factored" comes from. In this case it was actually referring to the HTML messages as triggering that one.


Edit: P.S. Thanks again for the gp code.

Last fiddled with by EdH on 2020-11-18 at 20:20

2020-11-18, 22:25   #40
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
11010₈ Posts

Quote:
Originally Posted by EdH
Here's a somewhat simple bash script:
...
Let me know of any programming flaws or troubles.

I am now an elf! Your script prompted me to install pari-gp, and I'm running it on 27-digit composites. Not sure how long I'll leave it going, but at 26 and 27 digits most of the numbers split into 3 9-digit factors; so I changed the factor bound to 10^9.

Maybe I'll try the python script tomorrow, so I can do "real" elf work.

2020-11-18, 22:35   #41
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns
3541₁₀ Posts

Quote:
Originally Posted by VBCurtis
I am now an elf! Your script prompted me to install pari-gp, and I'm running it on 27-digit composites. Not sure how long I'll leave it going, but at 26 and 27 digits most of the numbers split into 3 9-digit factors; so I changed the factor bound to 10^9.

Maybe I'll try the python script tomorrow, so I can do "real" elf work.

I'm happy to be inspirational. I have a script now that invokes YAFU after it runs the gp part as a test for smaller primes. I have it running on several machines right now for testing at 70 dd + 10. I intend to provide it as a "How I..." thread once I polish it a bit. I think I'll make it a Colab session first and then consider a desktop version.

2020-11-19, 18:26   #42
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns
3,541 Posts

Well, Bummer! I ran into the "maximum of 4 parallel processing requests" mentioned before, in a too big way. Because of my scripts running factoring continuously, once this occurred, the machines kept trying until all my machines got involved whenever each one completed its current factorization. This quickly overran the "Page requests" limit and effectively knocked my whole "farm" out, until I told them all to stop. I'm implementing some code to (hopefully) keep this from happening again (or, at least often).
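
One way to keep a failing machine from hammering the server is to back off between attempts. This is a minimal sketch under stated assumptions, not the code actually running on the farm: `retry_with_backoff` is a hypothetical helper, and the attempt count and delays are placeholders to tune against factordb's limits.

```shell
#!/bin/bash
# Hypothetical helper: run a command until it succeeds, doubling the wait
# after each failure and giving up after max_attempts tries.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
    local max_attempts=$1 delay=$2 attempt
    shift 2
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        "$@" && return 0
        # Do not sleep after the final failed attempt.
        ((attempt == max_attempts)) && break
        sleep "$delay"
        delay=$((delay * 2))
    done
    return 1
}
```

For example, `retry_with_backoff 5 60 fetch_composite` would try a (hypothetical) `fetch_composite` function up to five times, waiting 60, 120, 240 and 480 seconds between tries, and return non-zero if every attempt fails so the caller can stop instead of looping forever.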

The above trouble shouldn't affect the script I posted earlier, since I expect no one to run it on more than a few threads, but let me know if I should edit it as well.

Now, what happened? I thought everything below 70 digits was gone, but now 75k have shown back up while I was locked out. I'm not egotistical enough to think they appeared because I wasn't running my machines. . .

2020-11-20, 01:47   #43
warachwe
Aug 2020
23 Posts

Quote:
Originally Posted by EdH
Now, what happened? I thought everything below 70 digits was gone, but now 75k have shown back up while I was locked out. I'm not egotistical enough to think they appeared because I wasn't running my machines. . .

This is my guess at what happened.
On 15 November, someone entered numbers of the form b^n with b<20000 and n<1000, or somewhere around that ballpark. Since then the DB has been slowly determining whether each number is prime, and while doing so it produces a factor that is a product of small primes. When that factor has more than 19 digits, it doesn't get factored any further. (I think the DB automatically does this for all unknown numbers of less than 20,000 digits.) This is where those small composites came from.

The problem is that there might be up to 10^7 unknown numbers waiting in the queue, according to "Distribution of numbers by digits". Those will continue to produce small composites for quite a while.

2020-11-20, 02:51   #44
penlu
Jul 2018
31 Posts

I can't get enough numbers to run into the page request limit. Instead I've been running into the "IDs created" limit. Submitting a factorization of a number into a product of many small (< 10000) primes seems to create many IDs; I don't understand why.

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.