mersenneforum.org

FactorDB: Factoring small composites (https://www.mersenneforum.org/showthread.php?t=24411)

EdH 2020-11-18 03:47

[QUOTE=chris2be8;563523]If you have GP installed try something like:
[code]
echo 'factor($n,10^5)' | gp -f -q
[/code]That TFs a number up to 10^5 which is a much better way to split off small factors than running ECM.

Chris[/QUOTE]
I implemented this with 10^7 (because 10^5 and 10^6 failed several) and it is flying at just over 1 sec/composite! Thanks!
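As a rough illustration of why the bound matters, here is a minimal Python sketch of trial factoring up to a limit. It is not how GP implements `factor(n, lim)`; the function name `trial_factor` and the example numbers are made up for illustration. The point is only that a composite whose prime factors all sit above the bound comes back unsplit, which is what raising the limit from 10^5 to 10^7 addresses.

```python
def trial_factor(n, bound):
    """Divide out every prime factor of n that is smaller than `bound`.

    Returns (small_factors, cofactor). The cofactor is 1, a prime, or a
    composite whose prime factors are all >= bound.
    """
    factors = []
    d = 2
    while d < bound and d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        # After 2, only odd candidates need checking.
        d += 1 if d == 2 else 2
    return factors, n


if __name__ == "__main__":
    n = 101 * 1009 * 10007
    # With a bound of 10^3 only 101 is found; 1009 * 10007 stays composite.
    print(trial_factor(n, 10**3))
    # With a bound of 10^4 the number splits completely.
    print(trial_factor(n, 10**4))
```

Plain trial division in Python is far slower than GP, so this is only meant to show the effect of the bound, not to replace the `gp` call.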

However, I've done about 7000 25dd composites, and at the same time the count of 25dd composites rose from ~116k to ~163k. That seems to be in the wrong direction. And my other machines are still offline.

Happy5214 2020-11-18 08:29

Do you have a script (Linux) I can borrow? I might put a core or two on this in a day or so. Seems more helpful than the low-end aliquot sequences I'm working on now.

penlu 2020-11-18 11:05

Here is some python: [URL="https://github.com/penlu/factoring"]https://github.com/penlu/factoring[/URL]

The code is messy and variously suboptimal. Let me know if something is broken or if you need additional information to get it working.

To avoid repeatedly poking disk with msieve's save files, I run it as:
[CODE]TMPDIR=/dev/shm python worker2.py[/CODE]

I have a machine working at 20dd, offset 40000. This script is using rather less than one core. The bottleneck for such small composites appears to be communication with factordb: it allows us four parallel requests per IP. The script above uses two requests per number: one to query its ID and one to submit its factorization. With the response time I get from factordb, this apparently works out to some 250 factored per minute. This is maybe about the throughput the script will manage per IP address it is run on.
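The figures above can be cross-checked with a little arithmetic. The sketch below assumes all four request slots are kept busy the whole time; the function names are made up for illustration, and the per-request time is derived from the quoted numbers, not measured.

```python
def composites_per_minute(parallel_slots, requests_per_composite, seconds_per_request):
    """Throughput if every request slot is always in use."""
    requests_per_minute = parallel_slots * 60.0 / seconds_per_request
    return requests_per_minute / requests_per_composite


def implied_seconds_per_request(parallel_slots, requests_per_composite, per_minute):
    """Invert the formula: what response time does an observed rate imply?"""
    return parallel_slots * 60.0 / (per_minute * requests_per_composite)


if __name__ == "__main__":
    # 4 slots, 2 requests per number, about 250 numbers per minute.
    t = implied_seconds_per_request(4, 2, 250)
    print(f"implied response time: {t:.2f} s per request")
    print(f"throughput at that response time: {composites_per_minute(4, 2, t):.0f} per minute")
```

Under those assumptions, 250 per minute corresponds to roughly half a second per request, so a faster factoring step would not help; only fewer requests per number or a quicker server response would.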

EdH 2020-11-18 14:49

[QUOTE=Happy5214;563620]Do you have a script (Linux) I can borrow? I might put a core or two on this in a day or so. Seems more helpful than the low-end aliquot sequences I'm working on now.[/QUOTE]
Here's a somewhat simple bash script:
[code]
#!/bin/bash
############################################
# Note that this script uses two temporary
# files: dbTemp and dbSuccess. It will alter
# and delete previous files with these names.
# This script is single threaded. If using
# multiple instances, you should keep track
# of the hourly limits for factordb.
############################################

digitsize=25
totalrun=5

printf "Factoring $totalrun composites:\n\n"
for ((n=0;n<totalrun;n++))
do
startt=$SECONDS
randomnumber=$((10 + RANDOM % 1000))
wget "http://factordb.com/listtype.php?t=3&mindig=${digitsize}&perpage=1&start=$randomnumber&download=1" -q -O dbTemp
# Take the first word of the downloaded file as the composite.
read -r composite _ < dbTemp
echo "Composite $((${n}+1)) of $totalrun is $composite <${#composite}>"
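# Trial-factor up to 10^7; gp prints one "[prime exponent]" row per factor.
temp=$(echo "factor($composite, 10^7)" | gp -f -q)
# Collapse the rows onto one line, then turn "[p 1] [q 1]" into "*p*q".
# Note: this assumes every exponent is 1; a repeated prime would be mangled.
temp=$(echo "$temp" | xargs)
temp=${temp// 1]/}
temp=${temp//]/}
temp=${temp//[/*}
temp=${temp// /}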
printf "Factors are ${temp:1} <$((${SECONDS}-${startt}))s>\n\n"
returnfactors=${composite}%3D${temp:1}
wget "http://factordb.com/report.php?report=$returnfactors" -q -O dbSuccess
done
echo "Total time for $n composites was $SECONDS seconds."
rm dbTemp
[/code]Let me know of any programming flaws or troubles.
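One thing to watch in the string handling: the substitutions only strip an exponent of 1, so a row such as `[5 2]` would come out as `52`. A possible way around that, sketched here in Python, is to parse the `[prime exponent]` rows and repeat each prime. This assumes gp's output consists of such rows (which is what the script's substitutions imply) and that factordb accepts a repeated factor in the `composite%3Dfactor*factor` report string; the function names are hypothetical.

```python
import re


def gp_rows_to_product(gp_output):
    """Turn rows like "[3 1]" and "[5 2]" into the string "3*5*5"."""
    rows = re.findall(r"\[(\d+)\s+(\d+)\]", gp_output)
    factors = []
    for prime, exponent in rows:
        # Repeat each prime according to its exponent.
        factors.extend([prime] * int(exponent))
    return "*".join(factors)


def report_query(composite, gp_output):
    """Build the value passed as report=... (with %3D standing for '=')."""
    return f"{composite}%3D{gp_rows_to_product(gp_output)}"


if __name__ == "__main__":
    sample = "\n[3 1]\n\n[5 2]\n\n"
    print(report_query(75, sample))
```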

chris2be8 2020-11-18 17:17

[QUOTE=EdH;563530]Thanks! I'll definitely try to set that up on my RPi machines.

But, my "farm" machines have almost all crashed! They were working 70dd+10 and they're all coming up with gibberish:
[code]
href="res.php">limits</a>) already factored
Error, no result found
Factoring 3 digits: (<a
compCheck 100% 127KB 9.8MB/s 00:00
(<a already factored
(<a already factored
Error, no result found
Factoring 28 digits: href="imp.html">Imprint</a>)
compCheck 100% 127KB 9.7MB/s 00:00
href="imp.html">Imprint</a>) already factored
Error, no result found
Factoring 3 digits: (<a
compCheck 100% 127KB 9.5MB/s 00:00
(<a already factored
(<a already factored
Error, no result found
Factoring 31 digits: href="datenschutz.html">Privacy
compCheck 100% 127KB 9.3MB/s 00:00
href="datenschutz.html">Privacy already factored
Error, no result found
Factoring 11 digits: Policy</a>)
compCheck 100% 127KB 9.6MB/s 00:00
Policy</a>) already factored
Error, no result found
Factoring 0 digits:
compCheck 100% 127KB 9.6MB/s 00:00
already factored
Error, no result found
[/code](The compCheck is a file that contains a list of all the composites that any of the machines has factored, to minimize duplicates. I have to trim it often.)

I had to take the entire set down for now.:sad:[/QUOTE]

To judge by the bits of HTML, factordb is sending a web page instead of a number to factor, which might well be saying you have reached a limit on how much you can do in an hour. The limits are by IP address connecting to factordb, so they are shared by everything at that IP address.

I've had to update my scripts to wait for a few minutes before trying again when they see something like that, and to dump whatever they received to a log where I can see it.
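A minimal Python sketch of that kind of check, for anyone who wants a starting point. It is not the actual script described here: `fetch` is a hypothetical callable that returns the body of one download from factordb, and the five-minute wait and attempt count are arbitrary choices.

```python
import logging
import time


def looks_like_composite(text):
    """True only if the response is a bare decimal number."""
    stripped = text.strip()
    return stripped.isascii() and stripped.isdigit()


def fetch_composite(fetch, wait_seconds=300, max_attempts=5,
                    sleep=time.sleep, log=logging.getLogger("factordb")):
    """Call fetch() until it returns a number; log and wait otherwise.

    Returns the composite as an int, or None after max_attempts failures.
    """
    for attempt in range(1, max_attempts + 1):
        body = fetch()
        if looks_like_composite(body):
            return int(body.strip())
        # Keep what the server actually sent so the cause can be seen later.
        log.warning("attempt %d: unexpected response: %r", attempt, body[:500])
        sleep(wait_seconds)
    return None
```

With a check like this, fragments such as `(<a` or `href="res.php">limits</a>)` are logged and skipped instead of being handed to the factoring step as if they were numbers.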

I've also seen messages like:
[code]
1040: SQLSTATE[HY000] [1040] Too many connections
[/code]

Which is an internal problem in factordb.

Chris

EdH 2020-11-18 20:19

Actually, I wasn't anywhere near the hourly limits, but I think what I hit was, "You have reached the maximum of 4 parallel processing requests ([URL="http://www.factordb.com/help.php?page=2"]?[/URL]). Please wait a few seconds and try again." I had so many machines that were all factoring so fast, they hit the limit of 4 requests at once. And, other machines kept accessing fine, but the affected ones wouldn't correct themselves once they started failing.

I'm working on a bash script, based on the one posted, that runs the gp function first, and then runs YAFU, if factors are not found. YAFU doesn't seem to be giving me the really small factors (in factor.log) for all these multiple prime composites.
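The two-stage idea can be sketched in Python as follows. To be clear about the substitutions: trial division stands in for the gp call, and Pollard's rho stands in for YAFU purely so the example is self-contained; neither is what the bash script will actually run, and all names and the default bound are made up for illustration.

```python
import math
import random


def is_probable_prime(n):
    """Miller-Rabin probable-prime test with the first twelve primes as bases."""
    if n < 2:
        return False
    small = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in small:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


def pollard_rho(n):
    """Return a nontrivial factor of an odd composite n (stand-in for YAFU)."""
    while True:
        c = random.randrange(1, n)
        x = y = random.randrange(2, n)
        d = 1
        while d == 1:
            x = (x * x + c) % n
            y = (y * y + c) % n
            y = (y * y + c) % n
            d = math.gcd(abs(x - y), n)
        if d != n:
            return d  # otherwise retry with new random parameters


def factor_in_stages(n, bound=10**4):
    """Stage 1: strip small primes cheaply. Stage 2: split what is left."""
    factors = []
    d = 2
    while d < bound and d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    pending = [n] if n > 1 else []
    while pending:
        m = pending.pop()
        if is_probable_prime(m):
            factors.append(m)
        else:
            f = pollard_rho(m)
            pending.extend([f, m // f])
    return sorted(factors)
```

The cheap first stage guarantees the small primes are always reported, and the heavier second stage only sees a cofactor whose prime factors are all at or above the bound.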

I'm hoping to build in a check that what comes back really is a composite, because the Python script I was using couldn't seem to fix itself. For my own use, I also have a machine that tracks composites that have been started, so they "shouldn't" be started by another machine before being finished by the first. That's where the "already factored" comes from. In this case, it was the HTML fragments themselves that triggered that message.


Edit: P.S. Thanks again for the gp code.

VBCurtis 2020-11-18 22:25

[QUOTE=EdH;563662]Here's a somewhat simple bash script:
...
Let me know of any programming flaws or troubles.[/QUOTE]

I am now an elf! Your script prompted me to install pari-gp, and I'm running it on 27-digit composites. Not sure how long I'll leave it going, but at 26 and 27 digits most of the numbers split into 3 9-digit factors; so I changed the factor bound to 10^9.

Maybe I'll try the python script tomorrow, so I can do "real" elf work.

EdH 2020-11-18 22:35

[QUOTE=VBCurtis;563704]I am now an elf! Your script prompted me to install pari-gp, and I'm running it on 27-digit composites. Not sure how long I'll leave it going, but at 26 and 27 digits most of the numbers split into 3 9-digit factors; so I changed the factor bound to 10^9.

Maybe I'll try the python script tomorrow, so I can do "real" elf work.[/QUOTE]
I'm happy to be inspirational. I have a script now that invokes YAFU after it runs the gp part as a test for smaller primes. I have it running on several machines right now for testing at 70 dd + 10. I intend to provide it as a, "How I. . ." thread once I polish it a bit. I think I'll make it a Colab session first and then consider a desktop version.

EdH 2020-11-19 18:26

Well, Bummer! I ran into the "maximum of 4 parallel processing requests" mentioned before, in a too big way. Because of my scripts running factoring continuously, once this occurred, the machines kept trying until all my machines got involved whenever each one completed its current factorization. This quickly overran the "Page requests" limit and effectively knocked my whole "farm" out, until I told them all to stop. I'm implementing some code to (hopefully) keep this from happening again (or, at least often).
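One common way to keep retries from snowballing like this is to make each machine wait longer after every rejection and give up after a fixed number of tries. The Python sketch below is only an illustration under assumptions, not the code being implemented here: `request` is a hypothetical callable returning the response text, the delay values are arbitrary, and rejection is detected by looking for the message text quoted earlier in the thread.

```python
import random
import time

LIMIT_TEXT = "maximum of 4 parallel processing requests"


def is_rejected(reply):
    """True if the reply carries the parallel-request limit message."""
    return LIMIT_TEXT in reply


def backoff_delays(base=5.0, factor=2.0, cap=600.0, jitter=0.1, rng=random.random):
    """Yield growing delays (seconds), capped, with a little random jitter
    so that many machines do not all retry at the same instant."""
    delay = base
    while True:
        yield delay * (1 + jitter * rng())
        delay = min(cap, delay * factor)


def call_with_backoff(request, max_attempts=8, sleep=time.sleep, delays=None):
    """Retry request() with growing waits; stop for good after max_attempts."""
    if delays is None:
        delays = backoff_delays()
    for _ in range(max_attempts):
        reply = request()
        if not is_rejected(reply):
            return reply
        sleep(next(delays))
    raise RuntimeError("still rejected after %d attempts; stopping" % max_attempts)
```

Raising an error after the last attempt makes a machine drop out instead of adding to the page-request count while the limit is in force.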

The above trouble shouldn't affect the script I posted earlier, since I expect no one to run it on more than a few threads, but let me know if I should edit it as well.

Now, what happened? I thought everything below 70 digits was gone, but now 75k have shown back up while I was locked out. I'm not egotistical enough to think they appeared because I wasn't running my machines. . .

warachwe 2020-11-20 01:47

[QUOTE=EdH;563778]
Now, what happened? I thought everything below 70 digits was gone, but now 75k have shown back up while I was locked out. I'm not egotistical enough to think they appeared because I wasn't running my machines. . .[/QUOTE]

This is my guess at what happened.
On 15[SUP]th[/SUP] November, someone entered numbers of the form b[SUP]n[/SUP] with b<20000 and n<1000, or somewhere around that ballpark. Since then the DB has slowly been determining whether each number is prime, and in doing so it produces a factor that is a product of small primes. When that factor has more than 19 digits, it doesn't get factored any further. (I think the DB automatically does this for all unknown numbers of less than 20,000 digits.) This is where those small composites came from.

The problem is that there might be up to 10^7 unknown numbers waiting in the queue, according to [URL="http://factordb.com/distribution.php?size=3&start=0&lim=20000"]Distribution of numbers by digits[/URL]. Those will continue to produce small composites for quite a while.

penlu 2020-11-20 02:51

I can't get enough numbers to run into the page request limit. Instead I've been running into the "IDs created" limit. Submitting a factorization of a number into a product of many small (< 10000) primes seems to create many IDs; I don't understand why.


All times are UTC.