mersenneforum.org > Math Efficiently finding a linear progression in data
 Register FAQ Search Today's Posts Mark Forums Read

 2015-10-21, 15:48 #1 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 13×491 Posts Efficiently finding a linear progression in data Say I have set of a few hundred thousand 64-bit integers. Is there a cleverer way of finding cases where three of them are in arithmetic progression than trying all possible A, all possible B, and seeing (using a hash-table or similar) whether 2B-A is in the set? Obviously, for the stupid algorithm, I can look for longer arithmetic progressions: just keep looking until (k+1)*B-k*A is not in the set. Is there something cleverer I can do? ( Sorting the set of differences feels like a way to start, but I suspect that is just as expensive as the stupid algorithm because the set of differences is of length O(n^2) )
 2015-10-21, 16:05 #2 paulunderwood     Sep 2002 Database er0rr 53×29 Posts For my part in searching for titanic AP9s, a primorial form was chosen for density and the I ran 16 threads on a bit map -- 1 bit representing prime and 0 a composite. I think I had to use ~5.5 GB map. The downside was the overhead fetching a byte from RAM and decoding it. If I had had more RAM I would have used a byte per prime. Each thread stepped through the map looking for AP2, AP3, AP4 etc. We collected AP5s (or more) so that we could look outside the range for (probable) primes. Over about 2 years we found 5 titanic AP9s. Last fiddled with by paulunderwood on 2015-10-21 at 16:35
2015-10-21, 16:06   #3
science_man_88

"Forget I exist"
Jul 2009
Dumbassville

203008 Posts

Quote:
 Originally Posted by fivemack Say I have set of a few hundred thousand 64-bit integers. Is there a cleverer way of finding cases where three of them are in arithmetic progression than trying all possible A, all possible B, and seeing (using a hash-table or similar) whether 2B-A is in the set? Obviously, for the stupid algorithm, I can look for longer arithmetic progressions: just keep looking until (k+1)*B-k*A is not in the set. Is there something cleverer I can do? ( Sorting the set of differences feels like a way to start, but I suspect that is just as expensive as the stupid algorithm because the set of differences is of length O(n^2) )
modular remainder by the difference you want comes to mind but then you have to check for two that are equal.

2015-10-21, 16:57   #4
R.D. Silverman

Nov 2003

746010 Posts

Quote:
 Originally Posted by fivemack Say I have set of a few hundred thousand 64-bit integers. Is there a cleverer way of finding cases where three of them are in arithmetic progression than trying all possible A, all possible B, and seeing (using a hash-table or similar) whether 2B-A is in the set? Obviously, for the stupid algorithm, I can look for longer arithmetic progressions: just keep looking until (k+1)*B-k*A is not in the set. Is there something cleverer I can do? ( Sorting the set of differences feels like a way to start, but I suspect that is just as expensive as the stupid algorithm because the set of differences is of length O(n^2) )
The following might work. It is very off the cuff/shooting from the hip.......

Sort.
Compute the successive differences.
Compute the accumulated sums of the differences and do a bucket sort on them.
For each non-empty cell X in the bucket see if 2X is also non-empty.

e.g.

1 7 9 20 25 39 49

differences are:

6 2 11 5 14 10

The accumulated differences are:

8 19 24 38 48. Thus, place a 1 in each cell of the bucket in locations 8, 19, 24, 38, 48.

Now, if index k has a 1, see if index 2k also has a 1.

We see that location 19 is lit, so is 38. --> Thus, 1, 20, 39
We see that location 24 is lit, so is 48, --> Thus 1, 25, 49

 2015-10-21, 17:10 #5 jasonp Tribal Bullet     Oct 2004 DD116 Posts Does the Hough transform work in three dimensions?
2015-10-21, 17:16   #6
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

1,459 Posts

Quote:
 Originally Posted by fivemack ( Sorting the set of differences feels like a way to start, but I suspect that is just as expensive as the stupid algorithm because the set of differences is of length O(n^2) )
You can reduce this problem to the famous 3sum problem (https://en.wikipedia.org/wiki/3SUM) There was a major breakthrough on this problem in 2014, interestingly in the same year there was a seminar about the article in Hungary. (not participated on it). Don't know how fast this new algorithm in practice, but seeing the complexity it can't be much faster.
Note that the 3sum is a decision problem, but I think that the new alg. also finds the triplet (if the answer is yes) just like the easy O(n^2) alg.

For the reduction (it is pretty trivial):
Say array "a" holds the integers. Create three arrays X,Y,Z:

X[i]=10*a[i]+1
Y[i]=10*a[i]+2
Z[i]=-20*a[i]-3

For the concatenation of these three array run the 3sum algorithm. Seeing the last digit there is only one way to get zero: if you use one term from X and Y and Z array, for such hit:
X[i]+Y[j]+Z[k]=0, then
10*a[i]+10*a[j]-20*a[k]=0, so
a[k]=(a[i]+a[j])/2 what you needed. And in the other direction: if (a[i]+a[j])/2=a[k], then X[i]+Y[j]+Z[k]=0.

Last fiddled with by R. Gerbicz on 2015-10-21 at 17:22

2015-10-21, 17:39   #7
xilman
Bamboozled!

"πΊππ·π·π­"
May 2003
Down not across

101001100110002 Posts

Quote:
 Originally Posted by jasonp Does the Hough transform work in three dimensions?
Yes, it works in arbitrarily large number of dimensions, AFAIK. It also works in many (all? don't know for sure) smooth manifolds --- meaning you can look for curves such as conic sections by deforming the manifold until the curves are define by a set of linear equations.

A good imaging processing text might be a way into this field. Image processing and machine vision is where I learned about the Hough transform and its applicability.

2015-10-21, 18:30   #8
R.D. Silverman

Nov 2003

746010 Posts

Quote:
 Originally Posted by xilman Yes, it works in arbitrarily large number of dimensions, AFAIK. It also works in many (all? don't know for sure) smooth manifolds --- meaning you can look for curves such as conic sections by deforming the manifold until the curves are define by a set of linear equations. A good imaging processing text might be a way into this field. Image processing and machine vision is where I learned about the Hough transform and its applicability.
I'm sure that it works for smooth genus 0 manifolds. However, I think the Riemann-Roch theorem might get in the way for
higher genus. I need to check into this; my algebraic geometry knowledge is pretty basic.

2015-10-22, 07:09   #9
LaurV
Romulan Interpreter

Jun 2011
Thailand

11·853 Posts

Quote:
 Originally Posted by R.D. Silverman e.g. 1 7 9 20 25 39 49 differences are: 6 2 11 5 14 10 The accumulated differences are: 8 19 24 38 48.
1. Wouldn't those always be identical with the string elements, minus the first element? 8=9-1, 19=20-1, 24=25-1, etc?
2. How does this solves the bi-dimensional problem? (i.e a progression can start from 7, or from 9, on your example, not always from 1)
3. This is related to "giving a n, find elements in a vector that sum to n" problem. edit: scrap this, already discussed (the link is good to read)

Last fiddled with by LaurV on 2015-10-22 at 07:16

2015-10-22, 09:40   #10
R.D. Silverman

Nov 2003

22×5×373 Posts

Quote:
 Originally Posted by LaurV 1. Wouldn't those always be identical with the string elements, minus the first element? 8=9-1, 19=20-1, 24=25-1, etc?

Yes. I was clearly in error.

 2015-10-23, 09:57 #11 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 18EF16 Posts {53307 258162 463017} is, among three-term arithmetic progressions with n*2^10000+1 prime, the one with smallest largest term. {10838076 13044603 15251130 17457657} is the same for four-term arithmetic progressions The first five-term one has largest term >2^28 I can't find those particular numbers on the Web already, but I suppose people pick the place to search pretty much at random. For n*2^4000+1: 3 42537 56685 70833 4 2131248 3470313 4809378 6148443 5 86742492 99272412 111802332 124332252 136862172 Last fiddled with by fivemack on 2015-10-23 at 12:51

 Similar Threads Thread Thread Starter Forum Replies Last Post Puzzle-Peter Software 156 2019-06-03 20:19 storm5510 Information & Answers 15 2017-07-26 18:02 literka Math 0 2013-06-01 12:42 NBtarheel_33 Data 2 2010-09-02 03:14 Unregistered Information & Answers 1 2010-04-04 22:06

All times are UTC. The time now is 04:49.

Wed Apr 21 04:49:51 UTC 2021 up 12 days, 23:30, 0 users, load averages: 1.30, 1.29, 1.40