mersenneforum.org  

Old 2021-04-14, 12:44   #23
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

47×79 Posts

Thanks Jean-Luc! This will give me some direction to head in. Some seem like they will be easy to implement; others might be challenging.

I've tentatively added a placeholder for a routine to create an update file that can be used for just catching new terminations and merges, but since it would need to run all open-ended sequences, and that is probably how you update regina_file, I haven't created it yet. How often do you update the existing file? Would the update feature be of use? My intention was to create a second file, leaving regina_file alone. The extra file would then update the arrays in the program.
Old 2021-04-14, 14:27   #24
garambois
 
"Garambois Jean-Luc"
Oct 2011
France

1000111100₂ Posts

Quote:
Originally Posted by EdH View Post
Thanks Jean-Luc! This will give me some direction to head in. Some seem like they will be easy to implement; others might be challenging.
Let's be careful: I don't know if my ideas will work.
Some of my ideas are probably very naive; I am not a professional mathematician!
I work more by intuition than by reasoning.
You have seen that I have stated conjectures that can be proved in one line or that were already known.
I hope that some of my ideas will prove valid if you orient your work toward what I have said, but we have no guarantee of that!
But maybe, at the end of the work, we will also find a new "not naive" conjecture ;-) !


Quote:
Originally Posted by EdH View Post
I've tentatively added a placeholder for a routine to create an update file that can be used for just catching new terminations and merges, but since it would need to run all open-ended sequences, and that is probably how you update regina_file, I haven't created it yet. How often do you update the existing file? Would the update feature be of use? My intention was to create a second file, leaving regina_file alone. The extra file would then update the arrays in the program.
How often do I update the existing file?
That is the whole problem.
Completing regina_file from 1 to 1e6 took a few hours.
Completing it from 14e6 to 15e6 takes months.
What takes time in the program is searching for numbers in the tables, which become huge!
My updates to regina_file consist only of appending new parts.
I don't change the lines that are already there.
However, if you want to update all the existing lines by scanning the open-end sequences on FactorDB, that's great, and I'll be very interested in this completed "regina_file".
But I think it would be quite complicated to modify all 14 variables in each line without making an error!
I don't know if I understood correctly; is that what you want to do?
It seems very complicated to me!
Unless you don't modify all 14 variables?
Old 2021-04-14, 15:07   #25
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

47·79 Posts

Quote:
Originally Posted by garambois View Post
. . .
But I think it would be quite complicated to modify all 14 variables in each line without making an error!
I don't know if I understood correctly; is that what you want to do?
It seems very complicated to me!
Unless you don't modify all 14 variables?
I'm not currently making use of all the elements in regina_file, but to have an accurate update file, I will need to include all elements that I do use, i.e. if I start using numbers of maximums and parity changes, those elements will need to be included.

Let's see if I can give a brief description of my thoughts:

1. Leave regina_file unchanged, but load the currently used elements.
2. Run through the sequences, looking only for open-ended (o-e) ones, since the others should not have changed.
3. If an o-e sequence has terminated or merged, write a line (of at least the currently used elements) to the update file.
4. Next time the program is run, after reading the original regina_file, if the update file exists, it is used to update those sequences within the program arrays.

This would give the program the most up to date data based on the most recent update.
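
As a rough illustration of the four steps above, here is a minimal Python sketch. It is not the actual program: the function names and the update file path are hypothetical, the update file is assumed to use the same line format as regina_file, and the second field is assumed to be 0 for open-ended sequences (as in the `a==0` condition quoted elsewhere in this thread).

```python
import os


def parse_line(line):
    """Split a line such as '[2, 1, 1, 2, ...]' into a list of field strings."""
    return [field.strip() for field in line.strip().strip("[]").split(",")]


def read_rows(path):
    """Yield (n, fields) for each non-empty line; n is the first field."""
    with open(path) as handle:
        for line in handle:
            if line.strip():
                fields = parse_line(line)
                yield int(fields[0]), fields


def load_with_updates(main_path, update_path):
    """Steps 1 and 4: load the main file, then overlay the update file if present."""
    table = dict(read_rows(main_path))
    if os.path.exists(update_path):
        # Rows from the update file replace the rows with the same n.
        table.update(read_rows(update_path))
    return table


def open_ended(table):
    """Step 2: starting numbers whose second field is "0" (assumed: open-ended)."""
    return sorted(n for n, fields in table.items() if fields[1] == "0")
```

The main file is never written to, so an outdated or faulty update file can simply be deleted. Step 3 (rechecking each open-ended sequence and writing the changed ones) is left out because it depends on how the sequences are fetched.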

I imagine the update file could serve as a reference for which elements of regina_file might eventually need further work, but how important is it to be that current? If only a handful of o-e sequences have changed, are the hours spent running an update worth the gain?

As to whether to modify all the variables: depending on how many sequences have changed since the regina_file run, it may be better to include all elements. That way, if I add any reports, I wouldn't have to run the update again to gather the rest (although running only those in the update file might be relatively quick). My problem there is that I don't yet fully understand all of the elements.

For now, I'll work with the current regina_file and leave the update issue simmering. . .

BTW, if you are simply appending your current expansion to regina_file, the program should already be able to accept that expanded version, as long as the name remains regina_file. It shouldn't disturb the original, but I'd use a copy instead of the current one anyway, just to be sure. (Maybe, if you have a backup from some point, you could copy it into a unique directory with my program and try it. After it reads the new regina_file, it should display the new counts.)
Old 2021-04-15, 05:05   #26
Happy5214
 
"Alexander"
Nov 2008
The Alamo City

259₁₆ Posts

Quote:
Originally Posted by garambois View Post
2) A very simple way to visualize the data would be to launch a regina_file analysis by entering something like this in a program:
[n%2==0, a==0, b, c, d, e==0, f, g, h, i, j, 1.7<k<2.3, l, m]
Thus, we could specify a condition for each of the 14 variables in each row of regina_file.
In the example above, the analysis would give us all the sequences that start with an even number, that are open-end, that have no peak (so they are strictly increasing) and that have a slope of about 2 (so the size of the terms is multiplied by a factor of about 2 at each iteration).
This entry would allow us, for example, to find all the sequences that have the same very special graph as sequence 19560.
Thus, just among the sequences with 0 peaks, there are several types: we could recover the drivers that are perfect numbers by specifying a slope k close to 1, find the guides that ensure slopes of 2 and, above all, maybe see things not yet known ...
Another example: you want a bell-shaped sequence.
You enter:
[n, a==1, b, c, d, e==1, f, g, h, i, j, 0.8<k<1.2, l, m]
And you find sequences such as 2174880.
Of course, the goal is to try to notice correlations between these graph shapes and the prime factorization of the starting number of the sequence (this correlation exists at least for the sequences with 0 peaks, because of the perfect-number drivers, the 2-perfect numbers, the 3-perfect numbers...) or the prime number that ends the sequence (belonging to one branch or another of the infinite graph of aliquot sequences).

I have the same questions about the number of parity changes for each sequence.


But as said above, all this is long-running work started years ago.
And my problem is that I can't do this work in Python anymore, because regina_file is too big.
I have to work in C, and there I'm not at ease!
If you're going to do queries like that, you may want to convert the data into an SQLite database (no telling how big it would be, but it would likely be bigger than the uncompressed regina_file) and add some indices to make querying faster. I wonder if bundling a conversion program for the user to run in a reasonable amount of time is feasible (SQLite is available as two files, a C source and header, so distribution of that library is easy) so they don't have to download an even bigger file.
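
A minimal sketch of such a conversion, using Python's standard `sqlite3` module rather than the C library, might look like the following. The table name `regina`, the index, and the function names are hypothetical. The column names simply reuse the letters n, a ... m from the quoted query syntax, and integer fields are assumed to fit in a signed 64-bit value.

```python
import sqlite3

# Column names follow the letters used in the quoted query syntax (n, a ... m).
COLUMNS = list("nabcdefghijklm")


def to_number(text):
    """Convert one field to float if it has a decimal point, else to int."""
    text = text.strip()
    return float(text) if "." in text else int(text)


def build_db(lines, db_path=":memory:"):
    """Create the table, insert one row per regina_file line, and add an index."""
    conn = sqlite3.connect(db_path)
    cols = ", ".join(f"{name} NUMERIC" for name in COLUMNS[1:])
    conn.execute(f"CREATE TABLE regina (n INTEGER PRIMARY KEY, {cols})")
    rows = (
        tuple(to_number(part) for part in line.strip().strip("[]").split(","))
        for line in lines
        if line.strip()
    )
    placeholders = ", ".join("?" * len(COLUMNS))
    conn.executemany(f"INSERT INTO regina VALUES ({placeholders})", rows)
    # Index on the columns used by the example query below.
    conn.execute("CREATE INDEX regina_a_e_k ON regina (a, e, k)")
    conn.commit()
    return conn


def even_open_end_no_peak(conn, k_low=1.7, k_high=2.3):
    """SQL form of [n%2==0, a==0, ..., e==0, ..., 1.7<k<2.3, ...]."""
    sql = (
        "SELECT n FROM regina WHERE n % 2 = 0 AND a = 0 AND e = 0 "
        "AND k > ? AND k < ? ORDER BY n"
    )
    return [row[0] for row in conn.execute(sql, (k_low, k_high))]
```

Passing an open file object as `lines` and a file name as `db_path` would build an on-disk database. How long that takes for the full file, and how large the result is, would have to be measured.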
Old 2021-04-15, 17:08   #27
garambois
 
"Garambois Jean-Luc"
Oct 2011
France

2²·11·13 Posts

Quote:
Originally Posted by EdH View Post
BTW, if you are simply appending your current expansion to regina_file, the program should already be able to accept that expanded version, as long as the name remains regina_file. It shouldn't disturb the original, but I'd use a copy instead of the current one anyway, just to be sure. (Maybe, if you have a backup from some point, you could copy it into a unique directory with my program and try it. After it reads the new regina_file, it should display the new counts.)
I don't know if I understand you correctly.
Would it help you if I put the regina_file up to 14460000 online?
This is where I am at the moment.
(I paused the calculations for a few days, as I am perfecting my program for calculating sequences: I want to be able to use all threads simultaneously for the ECM and NFS methods.)
Old 2021-04-15, 18:01   #28
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

47×79 Posts

Quote:
Originally Posted by Happy5214 View Post
If you're going to do queries like that, you may want to convert the data into an SQLite database (no telling how big it would be, but it would likely be bigger than the uncompressed regina_file) and add some indices to make querying faster. I wonder if bundling a conversion program for the user to run in a reasonable amount of time is feasible (SQLite is available as two files, a C source and header, so distribution of that library is easy) so they don't have to download an even bigger file.
I've never been able to figure out how to use SQLite. I suppose that's something I should do sometime. I think I can implement many of the suggestions in some form; however, I can't grasp how to display the factor chain for all the composite sequence first terms.

Quote:
Originally Posted by garambois View Post
I don't know if I understand you correctly.
Would it help you if I put the regina_file up to 14460000 online?
This is where I am at the moment.
(I paused the calculations for a few days, as I am perfecting my program for calculating sequences: I want to be able to use all threads simultaneously for the ECM and NFS methods.)
I was merely suggesting that the program should already work with your current file, but a copy might be better, if your program that is adding sequences has the file open for processing. I don't think I need the newer one at this point, but in testing my program, you should be able to use the newer one.
Old 2021-04-16, 17:27   #29
garambois
 
"Garambois Jean-Luc"
Oct 2011
France

2²×11×13 Posts

Quote:
Originally Posted by EdH View Post
I was merely suggesting that the program should already work with your current file, but a copy might be better, if your program that is adding sequences has the file open for processing. I don't think I need the newer one at this point, but in testing my program, you should be able to use the newer one.

OK, great: the program works perfectly with the file up to 1446e4!
Old 2021-04-16, 21:52   #30
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

7201₈ Posts

Quote:
Originally Posted by garambois View Post
OK, great: the program works perfectly with the file up to 1446e4!
Excellent! Thanks for letting me know.
Old 2021-04-19, 08:15   #31
kar_bon
 
Mar 2006
Germany

2³×19² Posts

As usual I'm using (g)awk for processing text-files and here's my solution for handling the regina_file.

First I delete all brackets and spaces from the original regina_file by calling (here under Windows):
Code:
gawk -f list.awk regina_file
and using the "list.awk" source with
Code:
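# list.awk: strip the brackets and spaces from every line of regina_file
BEGIN { x=0 }
{
  sub(/\[/,"",$0)    # remove the opening bracket
  sub(/]/,"",$0)     # remove the closing bracket
  gsub(/ /,"",$0)    # remove every space
  print $0 >"regina_file_new"
  x++                # count the lines processed
}

END {
print "last: "x
}
This takes ~2 minutes to process all 14 million entries and produces a file that is ~200 MB smaller.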

To run a query, you have to split every line into an array and compare the fields with the wanted values.
Example:
The query for "19560" could look like this:
Code:
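BEGIN { x=0; found=0
}

{ split($0,a,",")                          # a[1]..a[14] hold the 14 fields
  x++                                      # count the lines read
  if (a[1]%2==0)                           # even starting number
    if (a[2]==0)                           # Open-End
      if (a[6]==0)                         # no peak
        if ((a[12]>1.7) && (a[12]<2.3))    # slope of about 2
        { found++
          print found": "a[1]
        }
}

END {
print "last: "x
}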
Note:
The query finds a great many lines with those parameters: you should write the results to a file and also bound the n-values (=a[1]).
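
For comparison, the same filter with the n-values bounded and the matches written to a file could be sketched in Python as follows. This is only an illustration: the function names and the output path are hypothetical, and the field positions (0, 1, 5, 11) are the zero-based equivalents of a[1], a[2], a[6] and a[12] above.

```python
def parse(line):
    """Parse '[2, 1, ...]' or '2,1,...' into a list of ints and floats."""
    parts = line.strip().strip("[]").split(",")
    return [float(p) if "." in p else int(p) for p in parts]


def like_19560(row):
    """Even start, second field 0, sixth field 0, twelfth field between 1.7 and 2.3."""
    return row[0] % 2 == 0 and row[1] == 0 and row[5] == 0 and 1.7 < row[11] < 2.3


def select(lines, predicate, n_min=1, n_max=None):
    """Yield the starting numbers n within [n_min, n_max] whose row matches."""
    for line in lines:
        if not line.strip():
            continue
        row = parse(line)
        n = row[0]
        if n < n_min or (n_max is not None and n > n_max):
            continue
        if predicate(row):
            yield n


def write_matches(src_path, out_path, predicate, n_min=1, n_max=None):
    """Stream src_path line by line and write one matching n per line to out_path."""
    count = 0
    with open(src_path) as src, open(out_path, "w") as out:
        for n in select(src, predicate, n_min, n_max):
            out.write(f"{n}\n")
            count += 1
    return count
```

A call such as `write_matches("regina_file_new", "like_19560.txt", like_19560, n_max=1000000)` would keep the output manageable. Because the file is streamed, memory use stays small regardless of file size.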

The example for "496" is like:
Code:
BEGIN { x=0; found=0
}

{
  split($0,a,",")      # a[1]..a[14] hold the 14 fields
  x++                  # count the lines read
  if (a[4]==496)       # fourth field equals 496
    { found++
      print found": "a[1]
    }
}

END {
print "last: "x
}
This runs ~1.5 min and finds 45 values.

Calling these with the appropriate command, such as
Code:
gawk -f 496.awk regina_file_new
is no problem.

I think many such queries can be done with nested IF statements like in the examples above.

Disadvantage: every query run has to process the whole regina_file_new again, since the data cannot be kept in one big array between runs.
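
One way to soften this cost, sketched here in Python under the same field-position assumptions as the awk examples, is to evaluate several queries during a single pass over the file. The query names and the `run_all` function are hypothetical.

```python
def parse(line):
    """Parse '[2, 1, ...]' or '2,1,...' into a list of ints and floats."""
    parts = line.strip().strip("[]").split(",")
    return [float(p) if "." in p else int(p) for p in parts]


# Each query is a name plus a test on the parsed row (zero-based indices).
QUERIES = {
    "like_19560": lambda r: (
        r[0] % 2 == 0 and r[1] == 0 and r[5] == 0 and 1.7 < r[11] < 2.3
    ),
    "fourth_field_496": lambda r: r[3] == 496,
}


def run_all(lines, queries=QUERIES):
    """Read the lines once and collect the matching n-values for every query."""
    results = {name: [] for name in queries}
    for line in lines:
        if not line.strip():
            continue
        row = parse(line)
        for name, test in queries.items():
            if test(row):
                results[name].append(row[0])
    return results
```

The file is still read in full, but only once per batch of queries instead of once per query.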

Note: awk can also be used to update regina_file or to insert a new parameter into it. A new parameter, calculated from the existing fields or from an external file, can be added without loading the whole file into a text editor or handling it some other way.
Old 2021-04-19, 11:25   #32
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

47·79 Posts

Thanks kar_bon,

The one issue of constantly reading the file was the main driver for my move to reading it once and then working with the arrays. Right now, I'm working with a new idea that may force a complete rewrite of the source. But, I'm also looking at some additions to the current version.

I do appreciate all thoughts.
Old 2021-04-19, 13:05   #33
garambois
 
"Garambois Jean-Luc"
Oct 2011
France

23C₁₆ Posts

Many thanks Karsten.

I am rewriting the entire program to add the geometric means.
This is the right time to change the format of the regina_file if necessary.
Do you think I should drop the "[", the "]" and the ",", replacing only the "," with spaces?
So, in the new regina_file, this line:
[2, 1, 1, 2, 1, 0, 1, 0, 0, 0, 0, 0.5000000000, 0.5000000000, 0.5000000000]
would be replaced by this one:
2 1 1 2 1 0 1 0 0 0 0 0.5000000000 0.5000000000 0.5000000000 0.5000000000
The fifteenth value is the geometric mean.
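
To make the proposed change concrete, here is a small Python sketch of the pure format conversion, together with a reader that accepts either format. The function names are hypothetical, and the sketch does not compute the geometric mean: that fifteenth value has to come from the sequence itself, not from the existing 14 fields.

```python
def to_space_separated(line):
    """Turn '[2, 1, ..., 0.5000000000]' into '2 1 ... 0.5000000000'."""
    return " ".join(part.strip() for part in line.strip().strip("[]").split(","))


def parse_any(line):
    """Return the fields of a line in the bracketed or the space-separated format."""
    cleaned = line.replace("[", " ").replace("]", " ").replace(",", " ")
    return cleaned.split()


old_line = "[2, 1, 1, 2, 1, 0, 1, 0, 0, 0, 0, 0.5000000000, 0.5000000000, 0.5000000000]"
new_line = to_space_separated(old_line)
# new_line == "2 1 1 2 1 0 1 0 0 0 0 0.5000000000 0.5000000000 0.5000000000"
```

A space-separated line can be split by awk's default field splitting or by Python's `str.split()` with no arguments, so readers no longer need to strip brackets and commas first.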
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.