mersenneforum.org  

Old 2012-07-10, 07:28   #1
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts
The Appallingly Blue Page

Hey everybody.

I'm rather pleased to announce that I have created a spider to automatically update Aliquot sequence statuses, and output them in a really pretty HTML table.

http://.../aliquot/AllSeq.html EDIT: outdated

http://www.rechenkraft.net/aliquot/AllSeq.html

Of course, the reason it's not been done before is strain on the FactorDB. Thus my spider is run once per hour, updates 55 sequences, then exits, saving its location for the next hour. This means that each sequence gets updated once over the course of a week, at which point it restarts at the beginning and does it again. The idea, of course, is that no sequence is more than a week out of date.
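
The batching scheme described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual spider: the function name `next_batch`, the JSON state file, and the toy sequence list are all made up for the example.

```python
import json
import os
import tempfile

BATCH_SIZE = 55  # sequences refreshed per hourly run, as described above

def next_batch(sequences, state_path, batch_size=BATCH_SIZE):
    """Return the next slice of sequences and persist the new position.

    `state_path` is a hypothetical JSON file holding the index at which
    the following run should resume.
    """
    start = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            start = json.load(f)["position"]
    if start >= len(sequences):
        start = 0  # full pass finished: restart at the beginning
    batch = sequences[start:start + batch_size]
    with open(state_path, "w") as f:
        json.dump({"position": start + len(batch)}, f)
    return batch

# Toy demonstration with 120 made-up sequence numbers.
seqs = list(range(276, 276 + 120 * 6, 6))
state = os.path.join(tempfile.mkdtemp(), "position.json")
runs = [next_batch(seqs, state) for _ in range(4)]
print([len(r) for r in runs])
```

With 120 toy entries, the third run picks up the 10 leftovers and the fourth starts over from the top. At 55 per hour, a week of 168 hourly runs covers 55 × 168 = 9240 sequences.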

In addition, not only can I update sequences, I can also check reservations here on MersenneForum (and the page even shows the last update to the reservations). That's why I asked about subproject reservations. For the moment then, the table only shows reservations from the main reservation thread, but there are two ways to fix that: 1) Put all subproject reservations in the lead post of the main reservation thread, or (probably easier for the mods) 2) format the leading subproject posts in the same manner as the main thread lead post.

And finally, though certainly not least, I found an excellent project called DataTables that uses JavaScript to allow for efficient sorting and searching of large tables. There's more detail on the page, so I encourage you to read it.

There are two major weaknesses: first, I can't do much about FDB sequence errors, though if you look there is a list of known wrong sequences. Second, this of course requires workers to upload their work to the FDB, though AFAICT this has essentially become the norm. I believe Paul Zimmermann does it semi-regularly, and though Christophe Clavier doesn't update the FDB himself, he does make .elf files available for all his sequences. I created a spider that runs weekly to (as necessary) upload his work to the FDB. Thus the only major work that might be missing is Clifford Stern's, though it's certainly possible that he is still updating the FDB.

To go along with it, I've created a basic statistics page with very simple stats; it looks kind of silly now, but that's still under construction.

I don't have much left I can think of to add; the last thing on my list was to add the very simple code to pick out drivers and display that just in front of the main "Factors" column. I welcome any suggestions.


What do you think?


Last fiddled with by Batalov on 2017-11-09 at 15:41 Reason: edited the link, as it appears to be hijacked for a long time now
Old 2012-07-10, 08:02   #2
kar_bon
 
Mar 2006
Germany

2×3×11×43 Posts

Here's an idea to think about:

I don't know how you download the last line of a sequence (perhaps like this).

If so, you could compare the last lines of all open seqs to find merges (two seqs whose last lines are the same).
However, because downloading all seqs at 55 per hour takes so long, not every merge would be found if someone worked on such a seq. in the meantime.

This feature is not currently implemented in the FactorDB, although it was a long time ago.

The FDB restrictions are what made me give up updating my pages:
I used to do such a last-line download (which took about an hour on a quad core with 4 threads) and generated my pages via script, finding merges and new terminations in seconds.

Last fiddled with by kar_bon on 2012-07-10 at 08:04
Old 2012-07-10, 08:18   #3
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts

It's fairly easy to query and pause, query and pause like I've done.

I do use "action=last", but as it is the spider doesn't record the actual value/id of the large numbers, just their lengths. It would be fairly easy to modify, but I'd only want to run such a merge-finder once a week, after each complete refresh. Now that I think about it though, such a thing would be easier than I was thinking 15 minutes ago. (When you mentioned it, I was initially thinking like in the "Genealogy" thread, which requires much more logic. )

I actually got the idea for the stats page from your website. I would love to help you in any way possible. My script is ~200+ lines of Python (plus a few hundred more lines of HTML templates). How did you update your site?
Old 2012-07-10, 08:33   #4
kar_bon
 
Mar 2006
Germany

5426₈ Posts

- First made a file with all open seqs
- creating 4 files downloading all last lines with wget
- downloading in 4 threads all last lines in 4 folders
- processing the last-line-files with awk script to get lines like this (old ones):
Code:
   276 U  1687. 3678759348...6<165> = 2 * 3^2 * 7^2 * 53 * 7869677296...1<160>
   552 U  1057. 4238228081...6<179> = 2^2 * 3 * 71 * 145633 * 3415741009...1<171>
   564 U  3357. 2239382335...8<172> = 2^2 * 7 * 31 * 103 * 6211 * 26557 * 1499962302625458296587675861761081389<37> * 1012395977...9<123>
   660 U   890. 2345292265...0<181> = 2^3 * 3^2 * 5 * 6514700736...9<178>
   966 U   893. 8491715927...0<178> = 2^2 * 3^2 * 5 * 83 * 2099 * 2707898746...9<171>
Here 'U' stands for 'Unchecked' in the FactorDB. 'P' would be prime so terminated.

- running another awk-script to do the html-pages (reservations were read from another file)
- running awk-script to make stats like this:
Code:
Counting OES per 100k-ranges:
000k 100k 200k 300k 400k 500k 600k 700k 800k 900k 
 902  953  918  855  889  951  939  927  959  960  9253

Counting OES-lengths:
    000k     100k     200k     300k     400k     500k     600k     700k     800k     900k 
 1323644  1342120  1250274  1121133  1209676  1283284  1238300  1262357  1272105  1257328   12560221   1357.421

Counting OES-sizes:
  000k   100k   200k   300k   400k   500k   600k   700k   800k   900k 
108573 110816 105032  97341 100646 108090 105993 105027 108294 108318  1058130   114.355
- running a sort-tool (CMsort) for finding merges and terminations
- running awk-script for small queries:
Code:
type=1: all Seqs of range r1 to r2
type=2: all Seqs <400 lines (-> Project 3b)
type=3: all Seqs 150000<n<200000, index<110 (-> Project 9)
type=4: all Seqs 100000<n<150000, index<110 (-> Project 7)
type=5: all Seqs length<100 digits of last index
type=6: all Seqs length<100 of composite
So the whole work was done in a little bit more than 1 hour, all data were 'just in time'.
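
The processing steps above were done with awk; a rough Python equivalent of parsing the last-line format and counting open sequences per 100k range might look like the sketch below. The regular expression and the names `LINE_RE`, `parse_line` and `per_range` are assumptions based only on the sample lines shown in this post.

```python
import re
from collections import Counter

# sequence, status letter, index, then the digit count in <...> of the last term
LINE_RE = re.compile(r'\s*(\d+)\s+([A-Z])\s+(\d+)\.\s+\S+?<(\d+)>\s*=')

def parse_line(line):
    """Parse one last-line record; return None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    seq, status, index, digits = m.groups()
    return {"seq": int(seq), "status": status,
            "index": int(index), "digits": int(digits)}

# Two of the sample lines from the post above.
SAMPLE = [
    "   276 U  1687. 3678759348...6<165> = 2 * 3^2 * 7^2 * 53 * 7869677296...1<160>",
    "   966 U   893. 8491715927...0<178> = 2^2 * 3^2 * 5 * 83 * 2099 * 2707898746...9<171>",
]

rows = [parse_line(s) for s in SAMPLE]
# Count still-open ('U') sequences per 100k range of the starting number.
per_range = Counter(r["seq"] // 100_000 for r in rows if r and r["status"] == "U")
print(rows[0])
print(per_range)
```

Both sample sequences fall into the 000k range, so the counter holds a single bucket.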
Old 2012-07-10, 21:52   #5
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT)

2·5·569 Posts

Nice page. I especially like the ways of sorting the table.
Not certain about the colour scheme. It seems a bit in your face to me.
Old 2012-07-10, 22:17   #6
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by henryzz
Nice page. I especially like the ways of sorting the table.
Not certain about the colour scheme. It seems a bit in your face to me.
I like blue

I'd be more open to suggestions for the table background than the page bg, but I'll listen to anything.

Edit: Why do the 'e' and 'r' keys have to be right next to each other?

Last fiddled with by Dubslow on 2012-07-10 at 22:22
Old 2012-07-12, 06:31   #7
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Okay, I've now added a 'Driver' column.

I have quite a few questions about what you guys want.

--Primarily, though it's called the 'Driver' column, it also lists the guides, as well as defaulting to listing the current power of two if no driver or guide is found.

*Should it keep this behavior?

*Should it not display guides/powers of two?

*Should it display any powers of the non-two factors (e.g. for seeing if a driver is escapable)?

*Should I get rid of the 'Factors' column altogether?
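
For readers wondering what "picking out drivers" could look like, here is a minimal sketch. It is not the code behind the page. It assumes the usual list of drivers (2, 2^3·3, 2^3·3·5, 2^5·3·7, 2^9·3·11·31 and the even perfect numbers, of which only the first four are included), it ignores guides, and it ignores the exponents of the odd primes, which matter for whether a driver can be escaped. The names `DRIVERS` and `driver_label` are made up for the example.

```python
# Each entry: (exact exponent of 2, odd primes that must be present).
# Assumed list: the classical drivers plus the first few even perfect numbers.
DRIVERS = [
    (1, ()),            # 2, the "downdriver"
    (1, (3,)),          # 2 * 3
    (2, (7,)),          # 2^2 * 7
    (3, (3,)),          # 2^3 * 3
    (3, (3, 5)),        # 2^3 * 3 * 5
    (4, (31,)),         # 2^4 * 31
    (5, (3, 7)),        # 2^5 * 3 * 7
    (6, (127,)),        # 2^6 * 127
    (9, (3, 11, 31)),   # 2^9 * 3 * 11 * 31
]

def driver_label(factors):
    """Return the driver found in `factors` (prime -> exponent).

    Falls back to the bare power of two when no driver matches.
    Exponents of odd primes are deliberately ignored in this sketch.
    """
    a = factors.get(2, 0)
    best = None
    for exp2, odd in DRIVERS:
        if exp2 == a and all(p in factors for p in odd):
            if best is None or len(odd) > len(best[1]):
                best = (exp2, odd)  # prefer the driver with more primes
    if best is None:
        return f"2^{a}" if a else ""
    head = "2" if best[0] == 1 else f"2^{best[0]}"
    return " * ".join([head] + [str(p) for p in best[1]])

# Small factors taken from kar_bon's sample lines (large cofactors omitted).
print(driver_label({2: 1, 3: 2, 7: 2, 53: 1}))        # sequence 276
print(driver_label({2: 2, 3: 1, 71: 1, 145633: 1}))   # sequence 552
print(driver_label({2: 2, 7: 1, 31: 1, 103: 1}))      # sequence 564
```

The second call has 2^2 without a 7, so it falls back to showing just the power of two, matching the default behaviour described above.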


--Additionally, as kar_bon suggested, the script now tracks the FDB ID of the last line, and once a week I run a script to check for any duplicates, i.e. merges. Also like he mentioned, the major flaw is that if a merged pair is updated between when I get the first branch and get the second branch, then the merge won't be detected. (It will take a week for the ID list to be fully populated.) (If you look hard enough, the ID is available for each sequence.)
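
The weekly duplicate check amounts to grouping sequences by the ID of their last line. A minimal sketch, with the function name `find_merges` and the example data invented purely for illustration:

```python
from collections import defaultdict

def find_merges(last_ids):
    """Group sequences that currently end on the same FDB ID.

    `last_ids` maps a sequence's starting number to the ID of its last
    line; any ID shared by two or more sequences indicates a merge.
    """
    by_id = defaultdict(list)
    for seq, fdb_id in last_ids.items():
        by_id[fdb_id].append(seq)
    return [sorted(seqs) for seqs in by_id.values() if len(seqs) > 1]

# Made-up sequence labels and IDs, not real FDB data.
example = {1001: 11, 1002: 22, 1003: 11, 1004: 33}
print(find_merges(example))
```

Since the IDs are collected over a whole week, two merged sequences whose shared tail advanced between the two fetches will show different IDs and be missed, which is the flaw noted above.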

--Would anybody want for each row/sequence to be a link to its status page?

Last fiddled with by Dubslow on 2012-07-12 at 06:33
Old 2012-07-12, 15:04   #8
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

8635₁₀ Posts

Very nice job with that table! I almost forgive you for those colors

Stupid question: how did the reservations get into that table? Do you add them by hand? In any case, please add me to 4290, which I have been nurturing since it was a C126 (see here about it; if I have some free CPU I will queue the C142)

Last fiddled with by LaurV on 2012-07-12 at 15:09
Old 2012-07-12, 17:11   #9
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

7221₁₀ Posts

Quote:
Originally Posted by LaurV
Very nice job with that table! I almost forgive you for those colors
Again, if anybody suggests a different scheme, I'd probably do it.
Quote:
Originally Posted by LaurV
Stupid question: how did the reservations get into that table? Do you add them by hand? In any case, please add me to 4290, which I have been nurturing since it was a C126 (see here about it; if I have some free CPU I will queue the C142)
Quote:
Originally Posted by Dubslow
In addition, not only can I update sequences, I can also check reservations here on MersenneForum (and the page even shows the last update to the reservations). That's why I asked about subproject reservations. For the moment then, the table only shows reservations from the main reservation thread, but there are two ways to fix that: 1) Put all subproject reservations in the lead post of the main reservation thread, or (probably easier for the mods) 2) format the leading subproject posts in the same manner as the main thread lead post.
Code:
import re
from urllib import request

def get_reservations():
    """Return ({sequence: reserver}, last edit date/time) from the forum."""
    reserves = {}
    # This is the lead post of the main reservations thread
    req = request.Request('http://www.mersenneforum.org/showpost.php?p=165249&postcount=1',
                          headers={'User-Agent': 'Dubslow/AliquotSequences'})
    page = request.urlopen(req).read().decode('utf-8')
    # The "Last fiddled with" note gives the date and time of the last edit
    update = re.search(r'<!-- edit note -->.*Last fiddled with by [A-Za-z_0-9 -]+? on ([0-9a-zA-Z ]+) at <span class="time">([0-9:]{5})</span>', page, flags=re.DOTALL)
    updated = update.group(1) + ' ' + update.group(2)
    # The reservation list sits inside the post's <pre> block
    page = re.search(r'<pre.*?>(.*?)</pre>', page, flags=re.DOTALL).group(1)
    for line in page.splitlines():
        # Each row: sequence number, two spaces, then the reserver's name
        match = re.match(r' {0,3}([0-9]{3,6})  ([0-9A-Za-z_ -]{1,16})', line)
        if match is None:
            continue  # not a reservation row
        name = match.group(2)
        if 'jacobs and' in name:
            # The pattern captures at most 16 characters, so restore the full name
            name = 'jacobs and Richard Guy'
        reserves[int(match.group(1))] = name.strip()
    return reserves, updated
You'll notice that the "(as of <date/time>)" note in the column header matches the date/time that the lead post of the main reservations thread was last edited at.

So if you want it to appear, ask the mods (Somehow my reservation of 484470 was missed in the last edit )

PS: Regarding merges, SM_88 directed me to this page, about which I had no idea. Anybody could fairly easily check for merges with it.

Last fiddled with by Dubslow on 2012-07-12 at 17:16
Old 2012-07-12, 17:45   #10
kar_bon
 
Mar 2006
Germany

2·3·11·43 Posts

Quote:
Originally Posted by Dubslow
PS: Regarding merges, SM_88 directed me to this page, about which I had no idea. Anybody could fairly easily check for merges with it.
This page with endings was generated only once, in March, and still contains some errors (missing lines and some duplicated ones).
I've just downloaded that page and compared it with the one from March: no changes!
Syd said back then that this page would not be updated because it causes too many DB accesses.
Old 2012-07-12, 17:49   #11
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by kar_bon
This page with endings was generated only once, in March, and still contains some errors (missing lines and some duplicated ones).
I've just downloaded that page and compared it with the one from March: no changes!
Syd said back then that this page would not be updated because it causes too many DB accesses.
Ah... good thing I'm tracking the IDs then.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.