mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   FactorDB (https://www.mersenneforum.org/forumdisplay.php?f=94)
-   -   Factoring small composites (https://www.mersenneforum.org/showthread.php?t=24411)

hansl 2019-05-09 04:34

Factoring small composites
 
I was considering trying to factor some composites from the Downloads -> "List of 1.000 randomly chosen, small composite numbers"
Is this useful to the overall purpose of factordb to fully factor these small composites?

Also is there any particular script or program I can use to only evaluate the expressions created by this, and spit them into a simple file of decimal numbers by line, so I can more easily test with various programs?
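One possible way to do that evaluation (a hedged sketch; the function names are mine, and only `+`, `-`, `*`, exact `/`, parentheses, and `^` for powers are handled, so forms like the primorial `#` would need extra work):

```python
# Hypothetical helper: evaluate simple factordb-style expressions
# ("2^295+94923", "(2^64+1)/274177") into plain decimal integers.
def eval_expr(expr: str) -> int:
    # factordb writes powers with ^ and exact division with /;
    # map them to Python's ** and integer //.
    py = expr.replace('^', '**').replace('/', '//')
    return eval(py, {"__builtins__": {}}, {})

def expressions_to_decimals(lines):
    # One decimal string per input expression, ready to
    # write out one number per line for other programs.
    return [str(eval_expr(s)) for s in lines]
```

Feeding the resulting file to yafu or similar is then straightforward.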

DukeBG 2019-05-09 20:11

[QUOTE=hansl;516193]I was considering trying to factor some composites from the Downloads -> "List of 1.000 randomly chosen, small composite numbers"
Is this useful to the overall purpose of factordb to fully factor these small composites?[/QUOTE]
They end up factored by someone eventually anyway, kinda.

Nobody is specifically waiting for them, and if someone is, they are wrong to do so, because factordb isn't, and shouldn't be treated as, a free factoring service. Anyone is better off factoring numbers themselves than submitting them to FDB and waiting for them to be factored.

Unfortunately, since it is so open, it kinda gets used like that anyway. There are a lot of "garbage" composite numbers that people might not even have submitted consciously; they just looked up some formulae.

I've downloaded that list just now and I see it has 91- and 92-digit numbers. You can see how many total there are [URL="http://factordb.com/stat_1.php"]here[/URL].

[QUOTE]Also is there any particular script or program I can use to only evaluate the expressions created by this, and spit them into a simple file of decimal numbers by line, so I can more easily test with various programs?[/QUOTE]

Generally you wouldn't need to; programs like yafu (which I feel is the most suitable for this task) can take the expressions as input, no problem.

LaurV 2019-05-10 05:15

There is a "yoyo" perl script somewhere, which does a wonderful job. The script takes a random composite from the db, factors it using yafu (or another external tool), then reports the result to the db. It is nice in the sense that it is "set it and forget it", and you can also specify the desired digit size and where the random composite is taken from (like "get a 120-digit composite from the smallest 100 numbers which are 120-digit composites"). The "randomness" is to avoid duplication of work; it still happens sometimes, because people want to factor "the smallest composite available" and fdb does not have an assignment procedure, but the probability is low. Of course, you must have perl installed, and some factoring tool, like yafu.

You can search the forum for yoyo.pl or so, and if you can't find it, I will post it soon when I get home (I do not have it here at work).
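The randomized-fetch idea described above can be sketched in a few lines of Python; the listtype.php parameters below mirror those used in scripts later in this thread and should be treated as assumptions about the current interface:

```python
import random
import urllib.request

def composite_url(digits, start):
    # t=3 selects composites; download=1 returns bare numbers.
    return ("http://factordb.com/listtype.php?t=3"
            f"&mindig={digits}&perpage=1&start={start}&download=1")

def fetch_random_composite(digits=120, window=1000):
    # Random page offset, to lower the chance that two workers
    # grab the same number (fdb has no assignment procedure).
    start = random.randrange(window)
    with urllib.request.urlopen(composite_url(digits, start)) as r:
        return r.read().decode().strip()
```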

DukeBG 2019-05-10 06:56

Small correction. FDB doesn't actually keep an ordering of numbers with the same digit count (by, erm, the numbers themselves). All the output within one digit size is in arbitrary order, so you cannot get the "smallest" 120-digit composite, for example. However, the numbers always come out in essentially the same order, the order they are read from the DB. So those scripts randomize the "page" from which the numbers are taken.

yoyo 2019-05-10 10:30

[QUOTE=LaurV;516354]There is a "yoyo" perl script somewhere, which does a wonderful job. The script takes a random composite from the db, factors it using yafu (or another external tool), then reports the result to the db. It is nice in the sense that it is "set it and forget it", and you can also specify the desired digit size and where the random composite is taken from (like "get a 120-digit composite from the smallest 100 numbers which are 120-digit composites"). The "randomness" is to avoid duplication of work; it still happens sometimes, because people want to factor "the smallest composite available" and fdb does not have an assignment procedure, but the probability is low. Of course, you must have perl installed, and some factoring tool, like yafu.

You can search the forum for yoyo.pl or so, and if you can't find it, I will post it soon when I get home (I do not have it here at work).[/QUOTE]

-> [url]https://www.rechenkraft.net/wiki/Benutzer_Diskussion:Yoyo/factordb#yafu.pl[/url]

chris2be8 2019-05-10 16:02

It's at [URL]http://mersenneforum.org/showthread.php?t=19232&page=3[/URL] post 25.

Note you must change factorization.ath.cx to factordb.com since the old DNS entry doesn't work now.

Chris

chris2be8 2020-11-01 17:31

[QUOTE=unconnected;561779]Looks like someone flooded factordb with thousands of small composites.
[url]http://factordb.com/stat_1.php[/url][/QUOTE]

I'm working on them. But it's like painting the Forth Bridge. And starting to get annoying.

[quote]
One fool can ask more questions than a thousand wise men can answer.
[/quote]

Though I don't think that alone would slow down sequence processing.

Chris

garambois 2020-11-01 19:43

Page updated, but only for bases 2, 3, 20, 21, 23, 162, 439, 496.
31^36 and complete base 385 are reserved for me.

Thanks to all for your help!
Please check if all your requests have been taken into account.

The other bases will be updated in the next few weeks, as each base takes a lot of time.

EdH 2020-11-02 13:18

Note: Post was edited to remove thread irrelevant content from this copied post. Original content can be found [URL="https://www.mersenneforum.org/showpost.php?p=561927&postcount=663"]here[/URL].

@chris2be8: I wonder if the large number of small composites on factordb was just a result of the rebuild. I did factor a few thousand, but quickly got close to the hour's limit for a couple values. They seemed to be clearing pretty fast at the time, so I left the rest alone.

chris2be8 2020-11-02 17:13

I don't think the small composites are from the rebuild. I've seen quite a few numbers like:
[code]
(738468*49##+557)/35840141013305049257323
(2^295+94923)/13352559799576307
[/code]

So it's general junk, not components of sequences. I've been tackling the ones that can be done by SNFS in the 80-90 digit range as well as everything from 70-79 digits. But don't hold your breath waiting for me to clear them all.

Chris

EdH 2020-11-02 17:37

[QUOTE=chris2be8;561962]I don't think the small composites are from the rebuild. I've seen quite a few numbers like:
[code]
(738468*49##+557)/35840141013305049257323
(2^295+94923)/13352559799576307
[/code]So it's general junk, not components of sequences. I've been tackling the ones that can be done by SNFS in the 80-90 digit range as well as everything from 70-79 digits. But don't hold your breath waiting for me to clear them all.

Chris[/QUOTE]
I had done about 50k at 90 dd, but noticed the count fell much more than my 50k, so I figured someone with some computing power was working there. I have since trimmed my work down considerably and moved it to 97 dd. I'm actually factoring more than are showing up, which is reassuring, even if it is pretty low ATM compared to sometimes. If you're only working to 90 dd, maybe I'll move down to 91 when I throw some work their way.

EdH 2020-11-02 19:28

Resurrecting this thread. . .
 
I'm going to bring this thread back to life to address the ongoing (elves) work for the db.

I referenced in an aliquot thread my current work with small fdb composites and I'd like to move that discussion to this thread. If Chris2be8 is agreeable I may move our relevant posts from there to here later on.

Currently I am running a couple Colab instances at 91 dd, but they are under-performing. Evaluating the backlog for even the 7x digit composites, I'm thinking of lowering my work down into that area. I'm using a pretty large random offset, so my duplication rate will hopefully be minimal. If anyone would like me to move higher and leave some "smaller fruit," don't be shy letting me know. I could easily work at 8x dd or higher, but I'm running ECM with some RPi's and a couple of my Colab sessions; one with YAFU and the other using the GPU branch of GMP-ECM and Msieve.

EdH 2020-11-03 19:12

Base 162 table should be all "greened up" now.

I've started "touching" all the sequences for the tables and it took a couple hours to do five tables this morning. I have started a script to run all of the tables except 220 and 284, since they're just being initialized and probably won't need any "help." I'm curious how long the script will actually take to get through everything.

EdH 2020-11-04 13:53

I was greeted this morning with the finished script, so every sequence in the current tables, and in base 72, has been accessed. Hopefully, that will get the update time back near normal.

richs 2020-11-04 14:22

I have had one core of an i3 running the perl script utilizing yafu for the past 6 months or so. It is set to start at C81 and to choose one of the first 1,000 composites, so I have not had much problem with worker collisions.

EdH 2020-11-04 15:47

[QUOTE=richs;562182]I have had one core of an i3 running the perl script utilizing yafu for the past 6 months or so. It is set to start at C81 and to choose one of the first 1,000 composites, so I have not had much problem with worker collisions.[/QUOTE]
I've got everything running from 76 upward with 1000 random as well. I have three RPi's running 24/7, a Core2 Duo running about 11hr/day and I've been running a couple Colab YAFU sessions per day recently. At this size the Colab sessions are doing better than 1000 per day, each.

I suppose I will catch up to your size pretty soon. If you do notice collisions, mention it here and I'll hop a little higher. I might do that anyway. . .

chris2be8 2020-11-04 16:56

@EdH, you are welcome to move my last 2 posts from the aliquot thread here, I was only using it as a place to let off steam about the amount of junk being added to factordb.

I've got one system working from 70-79 digits, it grabs the smallest number in that range, factors it and repeats. So if you want to work in that range taking an offset of 10+rand(1000) would make absolutely sure you would not collide with me.

I'm also working on 80+ digit numbers that can be factored by SNFS. I'm currently working around 84 digits (I was at 90 digits until recently). These are spaced out enough to make collisions unlikely.

And thanks for the help.

Chris

EdH 2020-11-04 18:50

[QUOTE=chris2be8;562194]@EdH, you are welcome to move my last 2 posts from the aliquot thread here, I was only using it as a place to let off steam about the amount of junk being added to factordb.

I've got one system working from 70-79 digits, it grabs the smallest number in that range, factors it and repeats. So if you want to work in that range taking an offset of 10+rand(1000) would make absolutely sure you would not collide with me.

I'm also working on 80+ digit numbers that can be factored by SNFS. I'm currently working around 84 digits (I was at 90 digits until recently). These are spaced out enough to make collisions unlikely.

And thanks for the help.

Chris[/QUOTE]Thanks, Chris. The posts have been moved (and mine copied/edited) and I have added the offset to all my local machines and the two Colab setups.

I'm looking at the feasibility of adding some scripts to my main ecmpi/CADO-NFS clients, to use them as elves when the CADO-NFS servers are doing LA. But, I'm not sure if/when I may do that. I have been away from elf work because I got really frustrated by the huge composite dump that happened right after we had cleared out the backlog through 100 (or, maybe it was even 110) digits. I'm a little encouraged ATM, with the actual overall number of composites through 120 showing a decrease instead of still growing.

EdH 2020-11-15 19:41

>40,000 composites added to the <70 queue?

@Chris: Do I remember you helping out the db when it gets this type of flooding? Should I turn a Colab session against them? Or, will this be a small task for the local db elves? I hesitate turning some machines from my "farm" against them because I think I'll hit my db limits.

warachwe 2020-11-15 21:16

I notice that most of the new numbers are of the form a^n+-1 for 10001<=a<=20000 and n somewhere around 20. Most of these numbers are already factored at cownoise.com. Is there any way to just transfer them over?

EdH 2020-11-15 21:31

[QUOTE=warachwe;563326]I notice that most of the new numbers are of the form a^n+-1 for 10001<=a<=20000 and n somewhere around 20. Most of these numbers are already factored at cownoise.com. Is there any way to just transfer them over?[/QUOTE]
I think a script to transfer them would hit the db limits, but one could be written. All it would need to do is submit the composite with its factors via the report routine. But, every one of the new factors would create a new ID which would bump the limit after a short while.

unconnected 2020-11-16 16:42

A bunch of new small composites just arrived :smile: :confused2:

I wonder what is the purpose of adding them to FDB - hoping the elves will factor them?

chris2be8 2020-11-16 17:03

I've started running a script to add algebraic factors where they exist to factordb. I'm currently working from 88 to 90 digits. Which should thin down the junk a bit.

But this is getting rather annoying.

Chris

EdH 2020-11-16 17:33

This is what frustrates me away from providing my external elves. Currently, I'm hitting back pretty hard with a bunch of systems running from 70dd(+10) upwards. I'm showing a rate of a little over 6k/hr factored. I also have a couple Colab sessions working the same region.

In addition, I've got some RPi's and a separate Colab session working in the 40dd range to see what that does.

EdH 2020-11-16 17:46

Actually, it looks like for >69dd, we've caught back up to yesterday.

Am I treading on your algebraic factor work, Chris?

EdH 2020-11-16 19:43

It's gone to >1M <70 digit composites!:max:

RichD 2020-11-17 02:24

I have been running in the 40dd and 50dd ranges to find where my C2D laptop won't bust the hourly limits. It seems these numbers all have at least 10-12 or more factors!

EdH 2020-11-17 03:29

[QUOTE=RichD;563440]I have been running in the 40dd and 50dd ranges to find where my C2D laptop won't bust the hourly limits. It seems these numbers all have at least 10-12 or more factors![/QUOTE]That explains why my Colab sessions weren't factoring 100%. I didn't look into why, but either YAFU wasn't returning factors less than 5 or 6 digits in size, or my Python code wasn't retrieving them.

EdH 2020-11-17 17:09

[QUOTE=RichD;563440]I have been running in the 40dd and 50dd ranges to find where my C2D laptop won't bust the hourly limits. It seems these numbers all have at least 10-12 or more factors![/QUOTE]
Thank you! Because of what you posted, I revisited my RPi's that are only doing ECM. They had been registering <50% success in factoring, which seemed odd. I was starting the ECM runs with B1=11e2 and stopping with 11e5. I changed my start B1 to 11e1 and now I'm showing better than 90% success with composites at 10 digits + 100 (offset).

chris2be8 2020-11-17 17:18

[QUOTE=EdH;563410]
Am I treading on your algebraic factor work, Chris?[/QUOTE]

No, I've stopped that (there don't seem to be any left in this range).

Looking at a few in the 70+ digit range most of them seem to be from factordb running ECM against a large number and finding a composite factor (the product of most/all of the small factors). Just click "More information" and see what it says it's a factor of. It's probably true for the under 70 digit range as well.

I'm currently adding algebraic factors to numbers with status unknown. At present they nearly all seem to have algebraic factors. Just look at [url]http://factordb.com/listtype.php?t=2[/url]

And I think the limits have been raised recently. Which is just as well given my 1 PC working from 70 to 79 digits seems to be factoring about 1 number per second!

Chris

chris2be8 2020-11-17 17:22

If you have GP installed try something like:
[code]
echo "factor($n,10^5)" | gp -f -q
[/code]

That TFs a number up to 10^5 which is a much better way to split off small factors than running ECM.

Chris
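For machines without PARI/GP, the same trial-factoring step can be sketched in Python (a naive loop, which is fine at bounds this small; the function name is mine):

```python
def trial_factor(n, bound=10**5):
    # Strip out all prime factors below `bound`.
    # Returns (factors, cofactor); cofactor is 1 when n is
    # fully factored, else the unfactored remainder.
    factors = []
    p = 2
    while p < bound and p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1 if p == 2 else 2   # after 2, test odd candidates only
    if n > 1 and p * p > n:
        # Every candidate below sqrt(n) was tested, so n is prime.
        factors.append(n)
        n = 1
    return factors, n
```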

EdH 2020-11-17 17:59

[QUOTE=chris2be8;563523]If you have GP installed try something like:
[code]
echo "factor($n,10^5)" | gp -f -q
[/code]That TFs a number up to 10^5 which is a much better way to split off small factors than running ECM.

Chris[/QUOTE]
Thanks! I'll definitely try to set that up on my RPi machines.

But, my "farm" machines have almost all crashed! They were working 70dd+10 and they're all coming up with gibberish:
[code]
href="res.php">limits</a>) already factored
Error, no result found
Factoring 3 digits: (<a
compCheck 100% 127KB 9.8MB/s 00:00
(<a already factored
(<a already factored
Error, no result found
Factoring 28 digits: href="imp.html">Imprint</a>)
compCheck 100% 127KB 9.7MB/s 00:00
href="imp.html">Imprint</a>) already factored
Error, no result found
Factoring 3 digits: (<a
compCheck 100% 127KB 9.5MB/s 00:00
(<a already factored
(<a already factored
Error, no result found
Factoring 31 digits: href="datenschutz.html">Privacy
compCheck 100% 127KB 9.3MB/s 00:00
href="datenschutz.html">Privacy already factored
Error, no result found
Factoring 11 digits: Policy</a>)
compCheck 100% 127KB 9.6MB/s 00:00
Policy</a>) already factored
Error, no result found
Factoring 0 digits:
compCheck 100% 127KB 9.6MB/s 00:00
already factored
Error, no result found
[/code](The compCheck is a file that contains a list of all the composites that any of the machines has factored, to minimize duplicates. I have to trim it often.)

I had to take the entire set down for now.:sad:

EdH 2020-11-17 19:30

Well, I've given up for now! I can't make my "farm" machines keep running for some reason. They all start out fine, but then spew junk after a few iterations. I guess the overwhelming small factors are somehow wrecking something. . .

I have the RPi's working in the 79dd+10 range, but nothing else. No Colab instances, either.

EdH 2020-11-18 03:47

[QUOTE=chris2be8;563523]If you have GP installed try something like:
[code]
echo "factor($n,10^5)" | gp -f -q
[/code]That TFs a number up to 10^5 which is a much better way to split off small factors than running ECM.

Chris[/QUOTE]
I implemented this with 10^7 (because 10^5 and 10^6 failed several) and it is flying at just over 1 sec/composite! Thanks!

However, I've done about 7000 25dd composites and, at the same time, the count of 25dd composites rose from ~116k to ~163k. That's movement in the wrong direction. And my other machines are still offline.

Happy5214 2020-11-18 08:29

Do you have a script (Linux) I can borrow? I might put a core or two on this in a day or so. Seems more helpful than the low-end aliquot sequences I'm working on now.

penlu 2020-11-18 11:05

Here is some python: [URL="https://github.com/penlu/factoring"]https://github.com/penlu/factoring[/URL]

The code is messy and variously suboptimal. Let me know if something is broken or if you need additional information to get it working.

To avoid repeatedly poking disk with msieve's save files, I run it as:
[CODE]TMPDIR=/dev/shm python worker2.py[/CODE]

I have a machine working at 20dd, offset 40000. This script uses rather less than one core. The bottleneck for such small composites appears to be communication with factordb: it allows us four parallel requests per IP. The script above uses two requests per number: one to query its ID and one to submit its factorization. With the response time I get from factordb, this works out to some 250 factored per minute, which is roughly the throughput the script will manage per IP address it is run on.
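The four-parallel-requests cap can also be respected explicitly. A hedged asyncio sketch (the limit of 4 is simply the per-IP figure reported above, and the helper names are mine):

```python
import asyncio

async def limited(sem, coro_fn, *args):
    # Run coro_fn under a shared semaphore, so at most the
    # semaphore's initial value of calls are in flight at once.
    async with sem:
        return await coro_fn(*args)

async def run_all(coro_fn, jobs, parallel=4):
    # Fan out over all jobs, capped at `parallel` concurrent calls.
    sem = asyncio.Semaphore(parallel)
    return await asyncio.gather(*(limited(sem, coro_fn, j) for j in jobs))
```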

EdH 2020-11-18 14:49

[QUOTE=Happy5214;563620]Do you have a script (Linux) I can borrow? I might put a core or two on this in a day or so. Seems more helpful than the low-end aliquot sequences I'm working on now.[/QUOTE]
Here's a somewhat simple bash script:
[code]
#!/bin/bash
############################################
# Note that this script uses two temporary
# files: dbTemp and dbSuccess. It will alter
# and delete previous files with these names.
# This script is single threaded. If using
# multiple instances, you should keep track
# of the hourly limits for factordb.
############################################

digitsize=25
totalrun=5

printf "Factoring $totalrun composites:\n\n"
for ((n=0;n<totalrun;n++))
do
startt=$SECONDS
randomnumber=$((10 + RANDOM % 1000))
wget "http://factordb.com/listtype.php?t=3&mindig=${digitsize}&perpage=1&start=$randomnumber&download=1" -q -O dbTemp
read -r composite < dbTemp
echo "Composite $((n+1)) of $totalrun is $composite <${#composite}>"
# Expand the factor() matrix into p1*p2*..., repeating each
# prime according to its exponent.
factors=$(echo "f=factor($composite,10^7); s=\"\"; for(i=1,matsize(f)[1], for(j=1,f[i,2], s=Str(s,f[i,1],\"*\"))); print(s)" | gp -f -q)
factors=${factors%\*}
printf "Factors are $factors <$((SECONDS-startt))s>\n\n"
wget "http://factordb.com/report.php?report=${composite}%3D${factors}" -q -O dbSuccess
done
echo "Total time for $n composites was $SECONDS seconds."
rm -f dbTemp dbSuccess
[/code]Let me know of any programming flaws or troubles.

chris2be8 2020-11-18 17:17

[QUOTE=EdH;563530]Thanks! I'll definitely try to set that up on my RPi machines.

But, my "farm" machines have almost all crashed! They were working 70dd+10 and they're all coming up with gibberish:
[code]
href="res.php">limits</a>) already factored
Error, no result found
Factoring 3 digits: (<a
compCheck 100% 127KB 9.8MB/s 00:00
(<a already factored
(<a already factored
Error, no result found
Factoring 28 digits: href="imp.html">Imprint</a>)
compCheck 100% 127KB 9.7MB/s 00:00
href="imp.html">Imprint</a>) already factored
Error, no result found
Factoring 3 digits: (<a
compCheck 100% 127KB 9.5MB/s 00:00
(<a already factored
(<a already factored
Error, no result found
Factoring 31 digits: href="datenschutz.html">Privacy
compCheck 100% 127KB 9.3MB/s 00:00
href="datenschutz.html">Privacy already factored
Error, no result found
Factoring 11 digits: Policy</a>)
compCheck 100% 127KB 9.6MB/s 00:00
Policy</a>) already factored
Error, no result found
Factoring 0 digits:
compCheck 100% 127KB 9.6MB/s 00:00
already factored
Error, no result found
[/code](The compCheck is a file that contains a list of all the composites that any of the machines has factored, to minimize duplicates. I have to trim it often.)

I had to take the entire set down for now.:sad:[/QUOTE]

To judge by the bits of HTML, factordb is sending a web page instead of a number to factor, which might well be saying you have reached a limit on how much you can do in an hour. The limits are by the IP address connecting to factordb, so they are shared by everything at that IP address.

I've had to update my scripts to wait for a few minutes before trying again when they see something like that. And dump whatever they received to a log where I can see it.

I've also seen messages like:
[code]
1040: SQLSTATE[HY000] [1040] Too many connections
[/code]

Which is an internal problem in factordb.

Chris
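The wait-and-retry logic Chris describes can be sketched like this (the fetch callable, retry count, and 300-second delay are placeholders, not factordb-specified values):

```python
import time

def is_bare_number(text):
    # A real composite download is just digits; limit pages are HTML.
    return text.strip().isdigit()

def fetch_with_backoff(fetch, retries=5, delay=300, log=print):
    # `fetch` is any callable returning the raw response body.
    for _ in range(retries):
        body = fetch()
        if is_bare_number(body):
            return body.strip()
        # Dump what we got, so the failure mode is visible in a log.
        log("non-numeric response, backing off: " + body[:60])
        time.sleep(delay)
    return None
```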

EdH 2020-11-18 20:19

Actually, I wasn't anywhere near the hourly limits, but I think what I hit was, "You have reached the maximum of 4 parallel processing requests ([URL="http://www.factordb.com/help.php?page=2"]?[/URL]). Please wait a few seconds and try again." I had so many machines that were all factoring so fast, they hit the limit of 4 requests at once. And, other machines kept accessing fine, but the affected ones wouldn't correct themselves once they started failing.

I'm working on a bash script, based on the one posted, that runs the gp function first, and then runs YAFU, if factors are not found. YAFU doesn't seem to be giving me the really small factors (in factor.log) for all these multiple prime composites.

I'm hoping to build in checking for a composite, because the Python script I was using couldn't seem to fix itself. For my own use, I also have a machine that tracks composites that have been started so they "shouldn't" be started by another machine before being finished by the first. That's where the "already factored" comes from. In this case it was actually referring to the HTML messages as triggering that one.


Edit: P.S. Thanks again for the gp code.

VBCurtis 2020-11-18 22:25

[QUOTE=EdH;563662]Here's a somewhat simple bash script:
...
Let me know of any programming flaws or troubles.[/QUOTE]

I am now an elf! Your script prompted me to install pari-gp, and I'm running it on 27-digit composites. Not sure how long I'll leave it going, but at 26 and 27 digits most of the numbers split into 3 9-digit factors; so I changed the factor bound to 10^9.

Maybe I'll try the python script tomorrow, so I can do "real" elf work.

EdH 2020-11-18 22:35

[QUOTE=VBCurtis;563704]I am now an elf! Your script prompted me to install pari-gp, and I'm running it on 27-digit composites. Not sure how long I'll leave it going, but at 26 and 27 digits most of the numbers split into 3 9-digit factors; so I changed the factor bound to 10^9.

Maybe I'll try the python script tomorrow, so I can do "real" elf work.[/QUOTE]
I'm happy to be inspirational. I have a script now that invokes YAFU after it runs the gp part as a test for smaller primes. I have it running on several machines right now for testing at 70 dd + 10. I intend to provide it as a, "How I. . ." thread once I polish it a bit. I think I'll make it a Colab session first and then consider a desktop version.

EdH 2020-11-19 18:26

Well, bummer! I ran into the "maximum of 4 parallel processing requests" mentioned before, in a big way. Because my scripts run factoring continuously, once this occurred each machine kept retrying as it finished its current factorization, until all of them were affected. This quickly overran the "Page requests" limit and effectively knocked my whole "farm" out, until I told them all to stop. I'm implementing some code to (hopefully) keep this from happening again (or, at least, not as often).

The above trouble shouldn't affect the script I posted earlier, since I expect no one to run it on more than a few threads, but let me know if I should edit it as well.

Now, what happened? I thought everything below 70 digits was gone, but now 75k have shown back up while I was locked out. I'm not egotistical enough to think they appeared because I wasn't running my machines. . .

warachwe 2020-11-20 01:47

[QUOTE=EdH;563778]
Now, what happened? I thought everything below 70 digits was gone, but now 75k have shown back up while I was locked out. I'm not egotistical enough to think they appeared because I wasn't running my machines. . .[/QUOTE]

This is what I guess happened.
On 15[SUP]th[/SUP] November, someone entered numbers of the form b[SUP]n[/SUP] with b<20000 and n<1000, or somewhere around that ballpark. Since then the DB has been slowly determining whether each number is prime, and while doing so it produces a factor which is a product of small primes. When that factor is more than 19 digits, it won't get factored further. (I think the DB automatically does this for all unknown numbers of less than 20,000 digits.) This is where those small composites came from.

The problem is there might be up to 10^7 unknown numbers waiting in the queue, according to [URL="http://factordb.com/distribution.php?size=3&start=0&lim=20000"]Distribution of numbers by digits[/URL]. Those will continue to produce small composites for quite a while.

penlu 2020-11-20 02:51

I can't get enough numbers to run into the page request limit. Instead I've been running into the "IDs created" limit. Submitting a factorization of a number into a product of many small (< 10000) primes seems to create many IDs; I don't understand why.

EdH 2020-11-20 14:42

[QUOTE=warachwe;563806]This is what I guess happened.
On 15[SUP]th[/SUP] November, someone entered numbers of the form b[SUP]n[/SUP] with b<20000 and n<1000, or somewhere around that ballpark. Since then the DB has been slowly determining whether each number is prime, and while doing so it produces a factor which is a product of small primes. When that factor is more than 19 digits, it won't get factored further. (I think the DB automatically does this for all unknown numbers of less than 20,000 digits.) This is where those small composites came from.

The problem is there might be up to 10^7 unknown numbers waiting in the queue, according to [URL="http://factordb.com/distribution.php?size=3&start=0&lim=20000"]Distribution of numbers by digits[/URL]. Those will continue to produce small composites for quite a while.[/QUOTE]2.2M <70 when I checked a few minutes ago. Why would the db not pull out p1-p10 primes first if it was breaking down larger composites? Almost all of these are made up of <10 digit primes.

[QUOTE=penlu;563809]I can't get enough numbers to run into the page request limit. Instead I've been running into the "IDs created" limit. Submitting a factorization of a number into a product of many small (< 10000) primes seems to create many IDs; I don't understand why.[/QUOTE]
Any time a "never before seen" number arrives, it is added to the db and generates a new ID. It does seem odd that many of these small composites are composed of primes not already known. I would think this would eventually stop being the case, unless the db handles small primes in a different manner from others. But the IDs for primes of fewer than 19 digits are the primes themselves. Perhaps these primes are not kept, which forces a new ID to be generated every time one shows up.

On a somewhat related issue, can anyone untar the .bz2 files generated on the Downloads page? I tried [URL="http://www.factordb.com/dlc.php?n=1000"]List of 1.000 randomly chosen, small composite numbers[/URL] and I can't open the file with anything. I just get errors.

Similarly, can several factorizations be uploaded at once instead of individually?

kruoli 2020-11-20 15:15

The .bz2 file is not in tar format. It is pure bzip2. 7-zip opens it just fine!

[URL="http://factordb.com/report.php"]The report page[/URL] enables you to upload multiple things at once; I'm not quite sure about the format. I guess something like [C]number = factor1 * ... * factorN[/C], one number per line?
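If that guess at the format is right, building a batch payload is trivial. A sketch; both the line format and its acceptance by report.php are assumptions here, not documented API:

```python
def report_payload(results):
    # results: iterable of (composite, [factor, ...]) pairs.
    # Emits one "number=factor1*factor2*..." line per composite.
    return "\n".join(
        f"{n}={'*'.join(str(f) for f in factors)}"
        for n, factors in results
    )
```

The returned string would then go into the report field of factordb.com/report.php in a single submission.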

EdH 2020-11-20 15:45

[QUOTE=kruoli;563837]The .bz2 file is not in tar format. It is pure bzip2. 7-zip opens it just fine!

[URL="http://factordb.com/report.php"]The report page[/URL] enables you to upload multiple things at once; I'm not quite sure about the format. I guess something like [C]number = factor1 * ... * factorN[/C], one number per line?[/QUOTE]
Well, that's embarrassing! It just so happens all my Ubuntu machines open it fine (why didn't I try any?). My main (Fedora) machine doesn't seem to have any 7-zip program. More study is needed.

As to the report, I haven't tried a multiple line upload yet, but I hope to soon.

Thanks!

chris2be8 2020-11-20 17:04

A load of numbers a few thousand digits long has been added to factordb recently. When factordb tries to factor them it runs ECM, which often finds a composite factor which is the product of most of the smallish factors of the number. That's probably the source of most of the small numbers with many factors.

And that's the reason why there are over 121000 PRPs under 3000 digits!

Try picking a few and clicking "More information" to see what it's a factor of and following the chain. That should show you where stuff is being added now.

Chris

penlu 2020-11-20 20:01

[QUOTE=EdH;563838]
As to the report, I haven't tried a multiple line upload yet, but I hope to soon.
[/QUOTE]

This is how I upload my results, but I advise not to do this because you might create many new IDs -- I set my stuff to upload only the largest factor found.

I realize this probably increases the load on factordb since for small composites the cofactor is often under the 19-digit autofactor threshold. But it seems to result in a higher clearance rate. The incentive is a little perverse?

EdH 2020-11-20 22:57

[QUOTE=chris2be8;563846]A load of few-thousand-digit numbers has been added to factordb recently. When factordb tries to factor them it runs ECM, which often finds a composite factor which is the product of most of the smallish factors of the number. That's probably the source of most of the small numbers with many factors.

And the reason why there are over 121000 PRPs under 3000 digits!

Try picking a few and clicking "More information" to see what it's a factor of and following the chain. That should show you where stuff is being added now.

Chris[/QUOTE]At least the explanations provide an acceptable reason for the huge influx. The PRPs I hadn't noticed! Might be time to shift in that direction for a while. I'll have to think about that. I had a set of scripts to work through a range across several machines, but I'd have to figure out which one was the server and how I ran them. I think I had an RPi acting as server.
[QUOTE=penlu;563862]This is how I upload my results, but I'd advise against it because you might create many new IDs -- I set my stuff to upload only the largest factor found.

I realize this probably increases the load on factordb since for small composites the cofactor is often under the 19-digit autofactor threshold. But it seems to result in a higher clearance rate. The incentive is a little perverse?[/QUOTE]I used to do the same - only return the largest prime. But, currently I'm sending the whole set of found factors back. I did just hit the limit, though, so I might have to rework my scripts to return only the largest again.

I'm also running Colab sessions and don't worry about them because they are only single threaded and unique IP'ed. All of my scripts presently do the full circle of retrieve composite - factor - return results - repeat. But, I plan to look into retrieving several, factoring all and returning all at once. I'm not sure there's a method for larger composites, though. They may have to be retrieved one at a time if you want randomness.
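A rough sketch of one step of that retrieve - factor - return cycle: checking a number's status before deciding whether to work on it. The `http://factordb.com/api?query=N` endpoint and its JSON shape (`{"status": ..., "factors": [[factor_string, exponent], ...]}`) are assumptions based on common usage, not documented in this thread; the actual factoring step (e.g. handing the composite to yafu) and the batching logic are left out.

```python
# Parse an assumed factordb API reply to decide whether a number still
# needs work. Endpoint and JSON layout are assumptions, not confirmed.
import json

def parse_status(api_json):
    """Return (status, factor_list) from an assumed API reply.
    Status "C" would mean composite (needs work), "FF" fully factored."""
    data = json.loads(api_json)
    factors = [(int(f), int(e)) for f, e in data["factors"]]
    return data["status"], factors

if __name__ == "__main__":
    # A made-up reply for 40 = 2^3 * 5, already fully factored.
    sample = '{"id":"x","status":"FF","factors":[["2",3],["5",1]]}'
    print(parse_status(sample))
```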

chris2be8 2020-11-21 16:56

[QUOTE=EdH;563892]I plan to look into retrieving several, factoring all and returning all at once.
[/QUOTE]

That would reduce the number of page requests per hour, but increase the risk of a collision. I'd only recommend it when working well below 70 digits, in a range no one else is working on. And keep the number requested fairly small, no more than 10.

But several Colab sessions working at once sounds very nice. Thank you.

Chris

EdH 2020-11-21 17:31

[QUOTE=chris2be8;563956]That would reduce the number of page requests per hour, but increase the risk of a collision. I'd only recommend it when working well below 70 digits, in a range no one else is working on. And keep the number requested fairly small, no more than 10.

But several Colab sessions working at once sounds very nice. Thank you.

Chris[/QUOTE]
Yeah, I've been shying away from the bulk runs for the reason you gave.

But, I have added a Colab PARI/GP thread, based on the earlier posted script*, to the forum at:

[URL="https://www.mersenneforum.org/showthread.php?t=26212"]How I Create a Colab Session That Factors factordb Small Composites with PARI/GP[/URL]

I have some of these running with larger "totalrun" values. Being single-threaded, they don't seem to be bumping the limits. But, I added number checking to (hopefully) alleviate any such overruns.

*If anyone is running the earlier posted (in this thread) script, they should copy it again. I found and fixed a bug in the original script and edited it in the original post.

EdH 2020-11-23 22:43

I factored just over 12k small composites in the 92- through 97-digit sizes so far today. Despite those 12k composites being removed, the region GREW by nearly 3k!

However, <91 briefly looks better...

richs 2020-11-24 02:53

I ran another i3 core on the C28's yesterday and factored about 12k also.

richs 2020-11-25 23:31

Restarted another i3 core on C28's.


All times are UTC. The time now is 12:08.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.