mersenneforum.org  

2005-08-24, 18:23   #1
delta_t
Nov 2002
Anchorage, AK

3·7·17 Posts
Long-term Primenet archive

As a favor to this community I've been archiving and hosting the hourly Primenet reports (summary.txt, status.txt, cleared.txt) as well as mirroring the software ftp directory here: http://www.mersenneforum.org/primenet

As of today this archive has grown to about 30GB. I've been burning DVDs regularly and sending a backup set over to Xyzzy for "safe keeping", but I'm wondering if it'd be a good idea to let another entity crawl the site and keep the older files in their archives. Naturally I'm talking about www.archive.org.

I've restricted all search engines from crawling the site (using robots.txt) to minimize traffic and to avoid bogging down those indexing/archiving servers with a massive file load. Basically, restricting these crawlers was done as a favor to them so they don't all of a sudden end up with 30GB of archived Primenet files (archived files are compressed with bzip2).

So what I'd like to hear from the community is whether it would be helpful to keep these files around for the long term? Are these files worth saving for the future, or are they only helpful for the short term?

I will still keep this server up as long as I have the resources available, but I just want to hear what people think about long-term archiving.
2005-08-24, 18:34   #2
garo
Aug 2002
Termonfeckin, IE

3²·307 Posts

I think it is useful to keep files for the long term, though perhaps not all 24x3 files a day. It would be much more practical and easier for you to keep, say, one or two files a day. Or maybe even come up with a way to keep diffs of the hourly files instead of entire files. That is, if space is an issue.

I usually find such files useful to refer back to for at least one year past their date.
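
To make the diff idea concrete, here is a minimal Python sketch, assuming each hourly report is read as a list of text lines. The function names `make_delta` and `apply_delta` and the sample lines are hypothetical and do not reflect the real Primenet file layout; this is one possible approach, not something the archive actually does.

```python
from difflib import SequenceMatcher


def make_delta(old_lines, new_lines):
    """Describe new_lines in terms of old_lines, storing only changed lines."""
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: remember only the index range in the old file.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Changed or new run: store the new lines themselves.
            delta.append(("add", new_lines[j1:j2]))
        # "delete" needs no entry: those old lines are simply never copied.
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the newer snapshot from the older one plus its delta."""
    rebuilt = []
    for op in delta:
        if op[0] == "copy":
            rebuilt.extend(old_lines[op[1]:op[2]])
        else:
            rebuilt.extend(op[1])
    return rebuilt


# Made-up snapshot contents, not the real status.txt layout.
hour_00 = ["exponent A assigned", "exponent B assigned", "exponent C assigned"]
hour_01 = ["exponent A assigned", "exponent B cleared", "exponent C assigned",
           "exponent D assigned"]

delta = make_delta(hour_00, hour_01)
assert apply_delta(hour_00, delta) == hour_01
```

With this scheme one full file per day plus 23 small deltas would be enough to rebuild every hourly report, at the cost of having to replay the deltas in order.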
2005-08-24, 18:52   #3
delta_t
Nov 2002
Anchorage, AK

3×7×17 Posts

It's no problem for me to archive these files and keep collecting them on an hourly basis, since I have the resources available right now (I actually have the archive on its own dedicated server). But should I open up the robots.txt to archive.org's crawler and have it try to archive what we already have? That way there would be a long-term site with the older files stored away in case something happens and we can't host the huge archive of older files ourselves. As an example, last year's archive (2004) from June-December was 18GB (I started archiving again in June).
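
As an illustration of what "opening up" robots.txt to a single crawler could look like, here is a small Python sketch that checks a candidate rule set with the standard library's `urllib.robotparser`. The robots.txt text below is hypothetical, and the `ia_archiver` user-agent token is an assumption about what archive.org's crawler identifies as; it should be confirmed against archive.org's own documentation before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow one named archiving crawler, block the rest.
# "ia_archiver" is an assumed token for archive.org's crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /
"""


def can_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules above let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


archive_url = "http://www.mersenneforum.org/primenet"
print(can_fetch("ia_archiver", archive_url))
print(can_fetch("SomeOtherBot", archive_url))
```

An empty `Disallow:` line means "nothing is disallowed" for that crawler, while `Disallow: /` under `User-agent: *` keeps every other crawler out, which preserves the original goal of minimizing traffic.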
2005-08-25, 00:31   #4
cheesehead
"Richard B. Woods"
Aug 2002
Wisconsin USA

1111000001100₂ Posts

I'm in favor of keeping them all, somewhere. You never know when they'll be very valuable for some reason.

... though probably just to some grad student doing a thesis on the history of distributed computing.

Last fiddled with by cheesehead on 2005-08-25 at 00:34

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.