20220217, 21:58  #1 
May 2003
377_{8} Posts 
Simultaneously Square Sums of Mostly Squares
Late last year I came up with this puzzle, but brute force and ignorance are now beginning to stall when it comes to finding solutions, so I thought perhaps you guys would like a stab:
Find triples of positive integers a,b,c where all of three permutations of the notquitesumsofsquares a+b²+c², a²+b+c², and a²+b²+c, are perfect squares. Here's the background to the problem, and what I've got so far: http://fatphil.org/maths/trivia/squaresums.html 
20220218, 17:20  #2  
Feb 2017
Nowhere
14105_{8} Posts 
Quote:
If a ≤ b ≤ c then a^2 + b^2 + c ≤ a^2 + b + c^2 ≤ a + b^2 + c^2 Difference between first and second terms (cb)(b+c1) Difference between second and third terms (ba)(b+a1) Difference between first and third term (ca)(c+a+1) The differences perhaps give some information about the restrictions (mod 8), although these have already been worked out. Note that (b+c1) + (cb) = 2*c  1 is odd, and similarly for the factors of the other differences. Thus, all differences are the product of an even factor and an odd factor, hence all are even. If a difference of two squares is even, it is divisible by 4. [If both squares are even, both are divisible by 4. If both are odd, both are congruent to 1 (mod 8) so their difference is divisible by 8.] Therefore, if a^2 + b^2 + c, a^2 + b + c^2, and a + b^2 + c^2 are all squares, none of cb, b+c+1, ba, b+a+1, ca, or c+a+1 can be congruent to 2 (mod 4). Three of them must be divisible by 4. Last fiddled with by Dr Sardonicus on 20220218 at 17:21 Reason: xifnif posty 

20220218, 22:12  #3  
"Robert Gerbicz"
Oct 2005
Hungary
11001001011_{2} Posts 
Quote:
Code:
void *worker(void *data) { uint64_t a; uint64_t b; uint64_t c; (void)data; while ((a = ga++) < high) { if ((a & 7) == 5) continue; uint64_t rc=a; for (b = 1; b <= a; b++) { const uint8_t f3 = f2[a & 7][b & 7]; if (f3) { uint64_t a2b2 = a * a + b * b; while(rc*rc<a2b2)rc++; c=rc*rca2b2; if (c<=b && f3 & (1 << (c & 7)) && issquare(a * a + b + c * c) && issquare(a + b * b + c * c)) printf("%" PRIu64 " %" PRIu64 " %" PRIu64 "\n", c, b, a); } } } return NULL; } and that is c=(sqrtint(a^2+b^2)+1)^2(a^2+b^2), and replaced/saved also some sqrt operations in your code. [the next c value would be c>=(b+1)^2b^2>b, because sqrt(a^2+b^2)>=b] Tested, works for a<10^5, but check it. 

20220219, 21:25  #4  
May 2003
3·5·17 Posts 
Quote:
Code:
// the square greater or equal to a^2+b^2, and its root uint64_t rt_ge_a2b2 = a + 1; uint64_t sq_ge_a2b2 = a2 + 2 * a + 1; for (b = 1; b <= a; b+=db1) { db1 ^= db2; const uint64_t f3 = f2[a & 7][b & 7]; if (!f3) continue; const uint64_t a2b2 = a2 + b * b; while (sq_ge_a2b2 < a2b2) { sq_ge_a2b2 += 2 * rt_ge_a2b2 + 1; rt_ge_a2b2++; } uint64_t c = sq_ge_a2b2 a2b2; // only one c that creates a square a^2+b^2+c is <= b if (c <= b) { I've not fed my changes back into inz's repo yet. There's a few more percent available here or there through microoptimisations, but I was hoping for a new fresh mathematicallyoriented attack, rather than just taking the same approach. In particular as it would be nice to be able to doublecheck the results. Cheers for your input though, and clearly you've got the insights into the problem. Last fiddled with by fatphil on 20220219 at 21:27 

20220219, 21:35  #5 
May 2003
3×5×17 Posts 

20220219, 22:35  #6  
"Robert Gerbicz"
Oct 2005
Hungary
11001001011_{2} Posts 
Quote:
Code:
Using M=100800 with t=8 threads done a=1..8000000, found 126 solutions in 1627 sec. with the following code: (or download the attachment) Code:
#include <stdint.h> #include <math.h> #include <stdio.h> #include <stdlib.h> #include <stdbool.h> #include <inttypes.h> #include <stdatomic.h> #include <pthread.h> #include <time.h> //compile: gcc Wall Wextra O3 o ss sssms.c lm lpthread #define M 100800 static inline bool issquare(uint64_t x) { uint64_t sqr = sqrtl(x); return sqr * sqr == x; } static _Atomic uint64_t ga = ATOMIC_VAR_INIT(1); static _Atomic int res=ATOMIC_VAR_INIT(0); static _Atomic int numsol=ATOMIC_VAR_INIT(0); static uint64_t high; int compare( const void* a, const void* b){ int int_a = * ( (int*) a ); int int_b = * ( (int*) b ); if ( int_a == int_b ) return 0; else if ( int_a < int_b ) return 1; else return 1; } const uint64_t A[2]={3467829026060516384ULL,3467829027134258208ULL}; static inline bool valid(uint64_t a){ a&=127; return (A[a>>6]&(1ULL<<(a&63)))==0; } void *worker(void *data){ (void)data; int i,j,count,rm,*arr,asize=4096; uint64_t a,b,c,a2b2,T[64]; bool R64[64]; arr=(int*)malloc(asize*sizeof(int)); while((rm=res++)<M){ if((M&7)==0&&(rm&7)==5)continue; a=(ga/M)*M+rm; if(a<ga) a+=M; for(i=0;i<64;i++)T[i]=0; for(i=0;i<64;i++)R64[i]=false; for(i=0;i<64;i++)R64[(i*i)&63]=true; int r0=(a&63); bool found=false; for(int r1=0;r1<64;r1++) for(int r2=0;r2<64;r2++) if(R64[(r0+r1*r1+r2*r2)&63]&&R64[(r0*r0+r1+r2*r2)&63]&&R64[(r0*r0+r1*r1+r2)&63]){ T[r1]+=(1ULL<<r2); found=true; } if((M&63)==0&&!found) continue; int64_t m=1,m2=M; bool cont=false; for(int p=2;p<=m2;p++)if(m2%p==0){ m2/=p; int64_t q=p; while(m2%p==0){m2/=p;q*=p;} bool check[q],found=false; for(i=0;i<q;i++)check[i]=false; for(int64_t I=0;I<q;I++)check[(I*I)%q]=true; int64_t R0=rm%q; for(int64_t R1=0;R1<q&&!found;R1++) for(int64_t R2=0;R2<q;R2++) if(check[(R0+R1*R1+R2*R2)%q]&&check[(R0*R0+R1+R2*R2)%q]&&check[(R0*R0+R1*R1+R2)%q]){ found=true; break; } if(!found){ cont=true; break; } } if(cont)continue; m=1; m2=M; arr[0]=0; count=1; for(int p=2;p<=m2;p++)if(m2%p==0){ m2/=p; int64_t q=p; while(m2%p==0){m2/=p;q*=p;} int64_t inv=1; for(;(inv*m)%q != 1;inv++); int ns=0; int rt[q]; bool check[q]; for(i=0;i<q;i++)check[i]=false; for(int64_t I=0;I<q;I++)check[(I*I)%q]=true; int64_t R0=rm%q; for(int64_t R1=0;R1<q;R1++) for(int64_t R2=0;R2<q;R2++) if(check[(R0+R1*R1+R2*R2)%q]&&check[(R0*R0+R1+R2*R2)%q]&&check[(R0*R0+R1*R1+R2)%q]){ rt[ns++]=R1; break; } if(asize<count*ns){ asize=count*ns; arr=(int*)realloc(arr,asize*sizeof(int)); } int pos=count*ns1; for(i=count1;i>=0;i) for(j=ns1;j>=0;j){ int64_t x=arr[i]; int64_t r=rt[j]; int64_t mult=((rx)*inv)%q; if(mult<0)mult+=q; arr[pos]=x+mult*m; } m*=q; count*=ns; } qsort(arr,count,sizeof(int),compare); int arr0=arr[0]; int new_arr0=M+arr[0]arr[count1]; for(i=count1;i>0;i) arr[i]=arr[i]arr[i1]; for(;a<=high;a+=M){ if(!valid(a))continue; int rec=0,ext[count]; uint64_t rc=a; for(i=0;i<count;i++)ext[i]=0; arr[0]=0; for(b=arr0;b<=a;){ int sh=(int) (256.0*(sqrt(a*a+(b+1)*(b+1))sqrt(a*a+b*b))); if(sh>rec){ rec=sh; for(i=0;i<count;i++) ext[i]=(arr[i]*sh)>>8; } for(i=0;i<count;i++){ b+=arr[i]; rc+=ext[i]; a2b2=a*a+b*b; while(rc*rc<a2b2)rc++; c=rc*rca2b2; if (c<=b && T[b&63] & (1ULL << (c & 63)) && issquare(a * a + b + c * c) && issquare(a + b * b + c * c) && c>0 && b<=a){ printf("%" PRIu64 " %" PRIu64 " %" PRIu64 "\n", c, b, a); numsol++; } } arr[0]=new_arr0; } }} free(arr); return NULL; } int main(int argc, char **argv) { int t = 1; int i; high = 1000000; if (argc >= 3) { ga = strtoull(argv[1], NULL, 10); high = strtoull(argv[2], NULL, 10); if (argc >= 4) t = atoi(argv[3]); } uint64_t a0=ga,a1=high; printf("Started the a=%" PRIu64 "..%" PRIu64 " range with M=%d using t=%d threads.\n",a0,a1,M,t); pthread_t threads[t  1]; time_t sec=time(NULL); for (i = 0; i < t  1; i++) pthread_create(&threads[i], NULL, worker, NULL); worker(NULL); for (i = 0; i < t  1; i++) pthread_join(threads[i], NULL); printf("Using M=%d with t=%d threads done a=%" PRIu64 "..%" PRIu64 ", found %d solutions in %ld sec.\n", M,t,a0,a1,numsol,time(NULL)sec); return 0; } with fixed a%M you can construct an array that could be the possible values for b%M. And we also avoid a lot of jumping work when you update c, because of the sparse b values. Of course the solutions will appear in random order due to the search for "a" in arithmetic progression. ps. you could try larger M values, but I haven't seen much better running times (and using larger M values could increase the time thanks to the more inits). ps2. M should be divisible by 64, otherwise it is missing some solutions! OK, it is logical, because when it doesn't hold, then the final check at the "if (c<=b && T[b&63]" is not that good, because (a%64) is not fixed in these cases. [And you can still use 128M or 256M etc] Last fiddled with by R. Gerbicz on 20220220 at 00:19 Reason: In text the constant was wrong for M. 

20220220, 14:21  #7  
May 2003
FF_{16} Posts 
Quote:
The latest version I'm running does have a p=3 sieve to reduce the work, and I was contemplating a p=5 layer too, but felt that just applying additional sieve layers was just getting small constant factors. Eventually N^2 would win, I was just putting off the inevitable. 

20220220, 15:58  #8 
Feb 2017
Nowhere
3·19·109 Posts 
Assuming a ≤ b ≤ c [as on the web site and my previous post] we can use
a ≥ (1 + sqrtint(b^2 + c^2))^2  b^2  c^2 to rule out swathes of pairs (b, c). Let f be the fractional part of sqrt(b^2 + c^2). If f is "too small," it will force a to be too large for a ≤ b to hold. We have a ≥ (1  f + sqrt(b^2 + c^2))^2  b^2  c^2, i.e. (1) a ≥ (1  f)^2 + 2*(1  f)*sqrt(b^2 + c^2). Since we are assuming a ≤ b ≤ c, we have sqrt(b^2 + c^2) ≥ sqrt(2)*a. Therefore, we always have 2*(1  f)*sqrt(2)*a < a, or 1  f < 1/(2*sqrt(2)), or (2) f > 1  1/(2*sqrt(2)). 
20220220, 17:50  #9  
"Robert Gerbicz"
Oct 2005
Hungary
3^{2}·179 Posts 
Quote:
For larger ranges it is somewhat faster, but of course its is still testing const*(a1^2a0^2) pair of (a,b) numbers for fixed M (when testing a=a0..a1). You wouldn't go that large, but I think that a<2^31.5M should be true to avoid overflow in unsigned 64 bits. 

20220220, 20:31  #10  
May 2003
3·5·17 Posts 
Quote:
The range I'm testing is 60M70M (it's the range futhest back in my tmux history), so well beyond the 26bits needed for an exact square root. However, we did some tests, and the sqrt test works even when the sqrt loses precision, because the comparison is with the resquared value as integers, it's not comparing a root with an integer as floats. We flung some test numbers at it, and couldn't get it to fail. Is there any other reason to think that 2^25 would be an upper limit to the code? 

20220220, 20:57  #11  
"Robert Gerbicz"
Oct 2005
Hungary
3^{2}×179 Posts 
Quote:
Quote:
Code:
uint64_t missed=0; for(uint64_t i=0;i<4294967296ULL;i++) if(!issquare(i*i))missed++; printf("missed=%" PRIu64"\n",missed); 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Sums of all Squares  davar55  Puzzles  183  20191212 22:31 
Basic Number Theory 12: sums of two squares  Nick  Number Theory Discussion Group  0  20161211 11:30 
Sums of three squares  CRGreathouse  Math  6  20091106 19:20 
Computer Factoring 2 Exponents Simultaneously?  jinydu  Lounge  13  20050307 01:37 
Identifing perfect squares and calculating square roots..  dsouza123  Math  2  20030719 17:17 