mersenneforum.org  

Old 2022-10-02, 04:26   #23
retina
Undefined
 
"The unspeakable one"
Jun 2006
My evil lair

6641₁₀ Posts

Quote:
Originally Posted by James Heinrich View Post
It was t-online.de that has blacklisted my server (presumably to this day), because:

<snip ridiculous t-online "reason">

So, because I don't wish to publish (what, my home address?) on a contact page for mersenne.ca, my server is considered computa non grata and will never be able to send email to @t-online.de
You could publish bogus contact information. It's not like anyone can simply contact Google with any of the information they publish online, so bogus details wouldn't be any different in that regard.

And if t-online really do try to contact you using the bogus details, and can't get through, then you lose nothing. What can they do, ban you harder?

Last fiddled with by retina on 2022-10-02 at 04:26
Old 2022-10-02, 16:08   #24
chris2be8
 
Sep 2009

2³·7·43 Posts

Quote:
Originally Posted by James Heinrich View Post
Did you email him from a common email provider like Gmail?
No, a special address set up for my sister. So it's not likely to be in any blacklists. And so far (touch wood) spammers haven't found it either.

He'll probably reply to any reasonably polite request if it gets to him.
Old 2022-10-02, 18:21   #25
kruoli
 
"Oliver"
Sep 2017
Porta Westfalica, DE

2×613 Posts

Quote:
Originally Posted by James Heinrich View Post
So, because I don't wish to publish (what, my home address?) on a contact page for mersenne.ca, my server is considered computa non grata and will never be able to send email to @t-online.de
Why do they think your homepage is commercial?

In Germany, we have quite strict laws regarding imprints of websites. Those who do not want to publish their personal information usually find someone else (there are paid services for this) who is willing to put their contact information there instead. This is usually combined with some kind of legal document that frees the volunteer from any liability and transfers it to the owner of the site.
Old 2022-10-04, 07:23   #26
LaurV
Romulan Interpreter
 
"name field"
Jun 2011
Thailand

3·5·683 Posts

About storing stuff for backup purposes (and others), we are also willing to participate with storage for everything related to aliquot sequences (including the "powers" sequences).

Not storing other stuff will make our DB more robust against spammers in case we ever go public with it, i.e. not all numbers will be accepted, only those related to factoring aliquot sequences.
We have wanted for a long time to create a personal DB, even thought of a format to optimally store those numbers to be easy to parse and see the "trees", merges, etc., and not waste the storage space too much, but the concept remained on paper and never materialized in a piece of software, for reasons of time and laziness.

We do however have a personal DB with almost all under-100-digits indexes and their factors, for starters under a million, and under-80-digits for higher starters up to 3M or so, but if all the info from the factorDB disappears, restoring the current status from our DB will take ages, because most of the work is needed for the larger (over 100 digits) numbers that are stored there. There are thousands of "hard" factorisations there (I mean, in factorDB, not in my DB) on which people spent weeks and months and thousands of core-hours, and which it would be a pity to lose.

That is, assuming that computers in 10 years or so will not become so fast (or new algorithms will not be found) as to make all the fuss we worked on for years factorable in just a few minutes.

Last fiddled with by LaurV on 2022-10-04 at 07:28
Old 2022-10-05, 00:58   #27
R. Gerbicz
 
"Robert Gerbicz"
Oct 2005
Hungary

7×229 Posts

Quote:
Originally Posted by LaurV View Post
There are thousands of "hard" factorisations there (...) or new algorithms will not be found to allow making all the fuss we worked for years, factorable in just few minutes
Do not bet on it.
Old 2022-10-12, 03:50   #28
Happy5214
 
"Alexander"
Nov 2008
The Alamo City

1101011111₂ Posts

Quote:
Originally Posted by LaurV View Post
About storing stuff for backup purposes (and others), we are also willing to participate with storage for everything related to aliquot sequences (including the "powers" sequences).
[...]
That's an idea. Mirrors/torrents of the aliquot elf files and select other data, such as the data from other "established" factoring projects (in some sort of agreeable archive format, like XZ tarballs), may be a more efficient use of space than mirroring the whole database, with its attendant junk data. It's also database format-neutral, allowing us to revamp the database schema if we have to rebuild the site while still keeping the core data.
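
As a rough illustration of the "XZ tarballs" idea, here is a minimal Python sketch that packs a directory of elf files into one XZ-compressed archive using the standard `tarfile` module. The function names and the `*.elf` naming pattern are assumptions made for the example, not anything FactorDB or the aliquot projects actually use.

```python
import tarfile
from pathlib import Path


def archive_elf_files(src_dir: str, out_path: str) -> list[str]:
    """Pack every *.elf file in src_dir into an XZ-compressed tarball.

    Returns the archived file names in the order they were added.
    The '*.elf' pattern is an assumption about how the files are named.
    """
    names = []
    with tarfile.open(out_path, "w:xz") as tar:
        for path in sorted(Path(src_dir).glob("*.elf")):
            # Store only the bare file name so the archive unpacks flat.
            tar.add(path, arcname=path.name)
            names.append(path.name)
    return names


def list_archive(path: str) -> list[str]:
    """Return the member names of an XZ tarball, e.g. to check a mirror."""
    with tarfile.open(path, "r:xz") as tar:
        return tar.getnames()
```

Because the archive is just a tarball of text files, a mirror stays independent of whatever database schema the site uses.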

Last fiddled with by Happy5214 on 2022-10-12 at 03:52 Reason: Clip
Old 2022-10-14, 09:41   #29
LaurV
Romulan Interpreter
 
"name field"
Jun 2011
Thailand

3×5×683 Posts

Quote:
Originally Posted by Happy5214 View Post
That's an idea. <...> may be a more efficient use of space than mirroring the whole database
Sure, storing all the elfs would be extremely inefficient. Most of the info repeats (due to mergers, etc). That is what I mean by

Quote:
Originally Posted by LaurV View Post
We <...> even thought of a format to optimally store those numbers to be easy to parse and see the "trees", merges, etc., and not waste the storage space too much
like, storing on each record the number, a flag to know if it is fully factored, its known factorization, together with two other things: the lowest sequence starter, and the longest sequence starter. That would be enough to reconstruct all the trees very fast. When a number is factored, it will need a minimum "touch" of other DB records (when mergers happen), which will be very fast, and no redundant info is stored. This DB would be quite fast to parse to work on a specific sequence, or to add new sequences. There are a few caveats that need to be detailed, however, for example the order in which records are stored (and why) or how to split the files (this DB will get large quickly, and splitting it into smaller files not only makes it more manageable, but also easier to back up, restore, reconstruct in case of damage, etc), but this is the idea.
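
One possible in-memory shape for such a record, as a hedged Python sketch: the class name, field names and the two helper methods are invented for illustration, and the starter values in any example are placeholders rather than real data.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class AliquotRecord:
    """One record per number, following the fields listed in the post."""
    number: int
    factors: tuple[int, ...]   # known prime factors, repeated per multiplicity
    fully_factored: bool       # the "flag" from the description
    lowest_starter: int        # lowest sequence starter reaching this number
    longest_starter: int       # starter of the longest sequence reaching it

    def cofactor(self) -> int:
        """Part of the number not yet covered by the known factors."""
        return self.number // prod(self.factors)

    def is_consistent(self) -> bool:
        """Known factors must divide the number; a 'full' flag needs cofactor 1."""
        known = prod(self.factors)
        if self.number % known != 0:
            return False
        return not self.fully_factored or known == self.number
```

For instance, `AliquotRecord(276, (2, 2, 3, 23), True, 276, 276)` passes `is_consistent()`, while the same record with factors `(2, 3)` and the flag still set does not.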
Old 2022-10-14, 23:40   #30
Happy5214
 
"Alexander"
Nov 2008
The Alamo City

1101011111₂ Posts

Quote:
Originally Posted by LaurV View Post
Sure, storing all the elfs would be extremely inefficient. Most of the info repeats (due to mergers, etc).
That makes it particularly suitable for compression, especially if you strip the indices from the lines, but moving on.

Quote:
Originally Posted by LaurV View Post
[...]storing on each record the number, a flag to know if it is fully factored, its known factorization, together with two other things: the lowest sequence starter, and the longest sequence starter. That would be enough to reconstruct all the trees very fast. When a number is factored, it will need a minimum "touch" of other DB records (when mergers happen), which will be very fast, and no redundant info is stored. This DB would be quite fast to parse to work on a specific sequence, or to add new sequences. There are a few caveats that need to be detailed, however, for example the order in which records are stored (and why) or how to split the files (this DB will get large quickly, and splitting it into smaller files not only makes it more manageable, but also easier to back up, restore, reconstruct in case of damage, etc), but this is the idea.
Several questions:
  1. Is this an SQL database, NoSQL, or a bunch of flat files?
  2. You do realize not every sequence merges directly into the longest preceding sequence, right? I don't know if that causes an issue.
  3. How does this handle terminating (whether at primes or cycles/perfect numbers) sequences?
  4. Every term is stored, right?
  5. Where are the indices?
  6. How does this result in less redundant info than FactorDB's "linked list aliquot sum" approach?
Old 2022-10-15, 05:05   #31
LaurV
Romulan Interpreter
 
"name field"
Jun 2011
Thailand

3·5·683 Posts

Quote:
Originally Posted by Happy5214 View Post
That makes it particularly suitable for compression, especially if you strip the indices from the lines, but moving on.

Several questions:
  1. Is this an SQL database, NoSQL, or a bunch of flat files?
  2. You do realize not every sequence merges directly into the longest preceding sequence, right? I don't know if that causes an issue.
  3. How does this handle terminating (whether at primes or cycles/perfect numbers) sequences?
  4. Every term is stored, right?
  5. Where are the indices?
  6. How does this result in less redundant info than FactorDB's "linked list aliquot sum" approach?
Re compression: what compresses better, duplicate (redundant) information, or no information at all? Think about it.
Re questions:

1. No. Flat text files. The advantage is that they can be opened by anything. Reconstructing the trees or searching for a sequence would be quite easy. They will be sorted by the stored numbers. A separate index/hash file is maintained, which will help with searching. This compresses very well, however an internal compressed format would not only make things murky (no "see with notepad" possible), it would also never beat the continuous development of external packers (WinZip, RAR, 7-Zip, sPArCK, etc), including encryption standards, etc. The user is free to pack their files the way they like in case they want to send them somewhere.

2. Yes, and there is no issue.

3. It does not. This is the job of whoever reads/interprets it (the app used to show/add/edit/work the sequences). FDB works kinda the same. This poses no issue for terminating sequences. The primes are also stored in the DB; they are numbers like any other and have their own lines. A line with a prime would look like "prime 1 a b", or "prime a b 1", where a and b are, respectively, the lowest sequence terminating in that prime and the longest sequence, and "1" is the factor/flag (see below). There is a good reason why these need to be stored (for any line, not only for primes). Why would you need to store other info? Give me a real example of a case when you need more. Note that prime lines don't really need a flag, as the only factor is 1, which is flag enough in itself.

4. Yes. Every term that appears in at least one sequence is stored, prime or not (but their factors are not stored on separate lines, unless they appear in a sequence). Non-prime lines look like "number factor factor factor a b"; they may have a "flag" to show whether the factorization is complete or not (it may speed up the reconstruction, or make things clearer for someone who reads the file without the app, or ease the search for things to factor, but this is not really necessary - if a flag is present it can contain info about cycles, etc, but again, this is not really necessary). The "factor" part lists the prime factors in order; they may have powers or not (again, irrelevant for the application, which can divide by each factor repeatedly for as long as it can, but clearer for someone who reads the text files "manually"). For example the first million lines in the file will be the first million numbers, with their factors and their "a" and "b".

5. Indices? Why do you need them? For any number you want to look up, go to it (the external hash file helps), and your app can reconstruct the sequence starting from that number, or starting from the smallest starter or the longest sequence reaching that number, in a split second, and show it to you on screen, possibly in a very nice graphic/tree form. With indices, too. This will also detect the mergers very fast, as a side effect. You start a sequence, and (unless it is at the same time the smallest and the longest starting from that number) you read the number, compute its sigma, read the next number, compute its sigma, etc; somewhere on the way you will find an "a" and "b" (grrr, stupid notation I chose!) which are different, and you know you have merged with a lower or longer sequence. Additionally, indices are confusing: they depend on where you start from. FDB doesn't store them either; they are computed once you specify the sequence you want to see.

6. It does not. I don't propose a way to "minimize the space" (note: I didn't say anything about compression or such - I have a 30TB SDD bought for $25, and I assume storage space will get cheaper and cheaper in the future). Let's be clear, the info in FDB is not redundant (in the way we define redundancy, like storing some futile info, or storing the same info multiple times). Every number is stored only once, and every number has its own unique index (unique key*). You can see these indexes/keys in the links when you look up a number in the DB; every number has its own key, regardless of whether it is part of an aliquot sequence or not. When you look at two sequences that "merge", and it shows you the same info twice, it doesn't mean that the info is stored twice, once for each sequence. It is not. It is the "application" (the database's elves) that "reconstructs" the sequence: calculate sigma for each line, get to that index, get the info of the new index, increment the index, calculate the new sigma, check for a cycle, etc. These things are not stored in the DB. What FDB does "better" is that it stores info about how each factor was found, when, by whom, etc. It would be VERY nice to keep this info, or even do it in a correct way in the future (note the quotes around "better": FDB tries to do this, in fact, but not always successfully, and the way user accounts are handled sucks - I talked about this in the past, somewhere around here). However, I didn't think about how this "correct" way would look in practice. Maybe separate files? Add a "comment code" on each line of the file and keep those comments in a separate file? How to make it "tamper proof"? Blockchain? hihi... This needs (a lot) more thinking.
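
The reconstruction loop sketched in answer 6 (compute sigma for a term, move to the next term, check for a cycle) could look roughly like the Python below. This is not FactorDB's code: the function names are made up, and trial division stands in for the stored factorizations, so it is only usable for small numbers.

```python
from collections import Counter


def factorize(n: int) -> list[int]:
    """Prime factors of n with multiplicity, by trial division (small n only)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def sigma_from_factors(prime_factors: list[int]) -> int:
    """Sum of divisors, using sigma(p^k) = (p^(k+1) - 1) / (p - 1) per prime power."""
    total = 1
    for p, k in Counter(prime_factors).items():
        total *= (p ** (k + 1) - 1) // (p - 1)
    return total


def aliquot_sequence(start: int, max_terms: int = 100) -> list[int]:
    """Follow n -> sigma(n) - n until 1, a repeated term, or max_terms.

    When a cycle is hit, the repeated term is included as the last element.
    """
    terms = [start]
    seen = {start}
    while terms[-1] > 1 and len(terms) < max_terms:
        n = terms[-1]
        nxt = sigma_from_factors(factorize(n)) - n
        terms.append(nxt)
        if nxt in seen:   # perfect number or longer cycle
            break
        seen.add(nxt)
    return terms
```

For example, `aliquot_sequence(12)` gives `[12, 16, 15, 9, 4, 3, 1]`, and `aliquot_sequence(220)` stops at `[220, 284, 220]` once the amicable pair repeats. Nothing here is stored per sequence; the terms are recomputed from the factorizations each time.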

--------------
* about these "unique keys": we understand their utility in the FDB context, but in our case, talking only about aliquot stuff, they are futile. The numbers themselves are the "keys", as we store them "uniquely".
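
To make the flat-file idea from answers 3 to 5 concrete, here is a small Python sketch that parses lines of the form "number factor factor factor a b", keys them by the number itself, and walks a sequence until the stored (a, b) pair changes. It is only one reading of the description and rests on several assumptions: no flag field, factors repeated rather than written as powers, prime lines written as "prime 1 a b", and a toy database that knows just the two starters 12 and 26. All function names are hypothetical.

```python
from collections import Counter


def parse_line(line: str):
    """Split 'number factor ... factor a b' into (number, factors, a, b)."""
    fields = [int(tok) for tok in line.split()]
    return fields[0], fields[1:-2], fields[-2], fields[-1]


def load_db(lines):
    """Build a dict keyed by the number itself (no separate unique key)."""
    db = {}
    for line in lines:
        if line.strip():
            number, factors, a, b = parse_line(line)
            db[number] = (factors, a, b)
    return db


def next_term(number: int, factors: list[int]) -> int:
    """sigma(number) - number, computed from the stored prime factors."""
    if factors == [1]:        # prime line: the sequence drops to 1
        return 1
    sigma = 1
    for p, k in Counter(factors).items():
        sigma *= (p ** (k + 1) - 1) // (p - 1)
    return sigma - number


def find_merge(db, start: int):
    """Return the first term whose (a, b) differs from the start's, else None."""
    _, a0, b0 = db[start]
    n, seen = start, {start}
    while True:
        n = next_term(n, db[n][0])
        if n not in db or n in seen:   # ran off the DB (e.g. reached 1) or cycled
            return None
        seen.add(n)
        _, a, b = db[n]
        if (a, b) != (a0, b0):
            return n


# Toy data: 12 -> 16 -> 15 -> 9 -> 4 -> 3 -> 1 and 26 -> 16 -> ...
TOY_LINES = [
    "3 1 12 26",
    "4 2 2 12 26",
    "9 3 3 12 26",
    "12 2 2 3 12 12",
    "15 3 5 12 26",
    "16 2 2 2 2 12 26",
    "26 2 13 26 26",
]
```

With this toy data, `find_merge(load_db(TOY_LINES), 26)` returns 16, the term where the sequence from 26 joins the lower sequence from 12; starting from 12 also reports 16, because from there on the longer sequence is the one from 26.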

Last fiddled with by LaurV on 2022-10-15 at 05:55
Old 2022-10-15, 06:27   #32
retina
Undefined
 
"The unspeakable one"
Jun 2006
My evil lair

1100111110001₂ Posts

Quote:
Originally Posted by LaurV View Post
I have a 30TB SDD bought for $25 ...
Wot? How? Where?

Oh, SDD. A 30TB document is gonna take a while to read through. Good luck with your eyes, I hope they still function after reading that monster.
Old 2022-10-15, 07:37   #33
LaurV
Romulan Interpreter
 
"name field"
Jun 2011
Thailand

10245₁₀ Posts

Bwhaa haha, ok, ok. I mean SSD. And yes, long live Aliexpress.
I don't want to advertise. You can search there for it. Type "30T SSD".
YMMV (I mean, you get the right price, but not always a working SSD - some are rubbish, but can still be used for storage - you can even get a 64T for $3, but that I won't try; they are most probably fake or factory rejects and may give you a lot of white hair and frayed nerves in the future).
My private storage space (that is, not provided by my employer or by online sites like Google Drive, but physical storage I have in my house and can hold in my hand, like HDD, SSD, uSD cards, M2) will soon reach half a petabyte.
Edit: some of the most expensive stuff in my collection are a few of these badass thingies (quite expensive too, especially the newest/largest/fastest ones, but I use them quite often and they are worth all the money - the E61 you see in the photo are not the fastest, they are 2 generations behind - and about the "extreme" part, they are really tough, you need a hammer to break them, they are waterproof to 10 meters or more, and you can hang from that hook with a rope - see their promo videos - and a carabiner (not a cabriolet)).
[Attached image: ssds - not sdds.jpg]

Last fiddled with by LaurV on 2022-10-15 at 09:15
All times are UTC. The time now is 22:32.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
