Register FAQ Search Today's Posts Mark Forums Read

2009-08-11, 14:47   #221
pschoefer

Jan 2007
.de

2·32 Posts

Quote:
 Originally Posted by Ken_g6 I was wondering when someone would ask about Windows. I'll look at making a 32-bit Windows .exe tomorrow. But I don't have a 64-bit Windows. Is there a gcc for 32-bit Windows or 32 or 64-bit Linux that will compile a 64-bit Windows binary? Or does someone want to compile it for me?
There's MinGW-w64 for Windows and Linux. I have it running on a Win64 host, so I could try to compile it, too.

 2009-08-11, 16:19 #222 biwema     Mar 2004 1011111012 Posts I see you intend to sieve 20000 n parallel, but only upt to 10000000. (a total range of 200G). According to this thred this would take 20000 times as long as one individual n. Did you (or the sieving tool) taks this property into consideration: Group for example 10 .. 20 n together by moving the exponent to the mantissa. For example: mantissa exponent 1000001 480000 1000001 480001 1000001 480002 ... 1000001 480015 1000001 480016 convert to: 1000001 480000 2000002 480000 4000004 480000 8000008 480000 ... 32768032768 480000 65536065536 480000 In that case you need to sieve up to 20 times less N, depending how large the mantissa can be. So the sieving is onlz 1000 times slower instead of 20000 times. Sieving would neither take longer nor take more memory in case you use hashtables (recommended anyway because of the large ranges) Of course, for LLR you can convert back the candidtes that you can still benefit of small mantissa.
2009-08-11, 16:35   #223
kar_bon

Mar 2006
Germany

2×1,433 Posts

Quote:
 Originally Posted by biwema Of course, for LLR you can convert back the candidtes that you can still benefit of small mantissa.
no need for that, LLR will reduce automatically from 65536065536*2^480000-1 to 1000001*2^480016-1!
Attached Thumbnails

Last fiddled with by kar_bon on 2009-08-11 at 16:40

 2009-08-11, 20:43 #224 Ken_g6     Jan 2005 Caught in a sieve 5×79 Posts OK, it's time for the good, the bad, and the ugly! First, the slightly bad news: I haven't figured out how to compile a 64-bit Windows version. Anyone's welcome to try; but I don't know if Geoff would prefer to do it himself? Second, the really good news, for 32-bit users on processors with SSE2. I made an SSE2-enabled 32-bit app that's about 75% as fast as the 64-bit app, for many-N sieves. That's about three times as fast as the regular 32-bit app! (I haven't checked which is faster for single-N sieves; but there shouldn't be much difference.) My zipfile now includes both 32-bit versions for both Windows and Linux, and a 64-bit Linux version. Finally, the ugly - which is what coding biwema's suggestion would be. Right now, checking the second N is about 60-80 times as fast as the first N, because all one has to do is divide the result for the first N by 2 (mod P). This takes two short lines of code. To do what biwema suggests would first require checking whether even K's between kmax and, say, 65536*kmax, have enough 0's on the right that shifting right would put the K between kmin and kmax. This is unlikely, so it would probably require counting the 0's individually. This is pretty much as slow as doing the above divide, unless there's a processor-specific instruction to count right 0's (there may be in SSE4 or something.) Even assuming you get that working quickly, you still have to divide by 2^16 (mod P). Of course the way to do this is to multiply by 2^-16 (mod P). I'm sure this would never be faster than the new SSE2 code in 32 bits, and it's not all that fast even with 64 bits. - Edit: strike that, that's a multiply, not an exponentiation. It's still slow, but not that slow. Still, I'm not planning to try this soon. But if anyone's willing to try it, please, be my guest. Last fiddled with by Ken_g6 on 2009-08-11 at 21:14
2009-08-11, 21:00   #225
axn

Jun 2003

3·5·17·19 Posts

Quote:
 Originally Posted by Ken_g6 Second, the really good news, for 32-bit users on processors with SSE2. I made an SSE2-enabled 32-bit app that's about 75% as fast as the 64-bit app, for many-N sieves. That's about three times as fast as the regular 32-bit app! (I haven't checked which is faster for single-N sieves; but there shouldn't be much difference.) My zipfile now includes both 32-bit versions for both Windows and Linux, and a 64-bit Linux version.
Yay I am testing this one as soon as I get back from work.

Quote:
 Originally Posted by Ken_g6 Finally, the ugly - which is what coding biwema's suggestion would be.
Biwema assumed that sieving 20000n in parallel is 20000x slower. But we know that that is not true. So the evaluation needs to be adjusted accordingly.

2009-08-11, 21:27   #226
pschoefer

Jan 2007
.de

2·32 Posts

Quote:
 Originally Posted by Ken_g6 First, the slightly bad news: I haven't figured out how to compile a 64-bit Windows version. Anyone's welcome to try; but I don't know if Geoff would prefer to do it himself?
I think I got it. It's running very fast (more than twice as fast as the 0.2.3 I used before) and finding the same factors, so I think it's OK, although I got those warnings:
Code:
gcc -Wall -O3 -DNDEBUG -D_REENTRANT -m64 -I. -I.. -s -o tpsieve-x86_64-windows.exe ..\main.c ..\sieve.c ..\clock.c ..\util.c app.c -lm -lpthread
..\main.c:529: warning: cast from pointer to integer of different size
..\main.c:653: warning: cast from pointer to integer of different size
..\main.c: In function 'main':
..\main.c:755: warning: cast to pointer from integer of different size

 2009-08-11, 21:35 #227 Skligmund     Dec 2006 Anchorage, Alaska 4E16 Posts I'm not savvy with the tpseive as of yet (never even used it), but which is quickest? I would assume linux x64 is faster than win32?
2009-08-11, 22:07   #228
Mini-Geek
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts

Quote:
 Originally Posted by Ken_g6 Second, the really good news, for 32-bit users on processors with SSE2. I made an SSE2-enabled 32-bit app that's about 75% as fast as the 64-bit app, for many-N sieves. That's about three times as fast as the regular 32-bit app! (I haven't checked which is faster for single-N sieves; but there shouldn't be much difference.) My zipfile now includes both 32-bit versions for both Windows and Linux, and a 64-bit Linux version.

So the 32-bit version was only 25% the speed of the 64-bit version? (from: "75% as fast as the 64-bit app" and "three times as fast as the regular 32-bit app") I thought 32-bit versions of math programs, in general, were at worst 50% the speed of the 64-bit versions?

2009-08-11, 23:28   #229
Ken_g6

Jan 2005
Caught in a sieve

18B16 Posts

pschoefer: I get those warnings every time too, compiling for 64-bit Linux. It doesn't seem to be a problem. You can expect to find the same factors, but occasionally they'll be in a different order.

Skligmund: Linux vs. Windows shouldn't matter. (In theory, that is; no guarantees.) x64 > SSE2 > 32 bit.

Quote:
 Originally Posted by Mini-Geek So the 32-bit version was only 25% the speed of the 64-bit version? (from: "75% as fast as the 64-bit app" and "three times as fast as the regular 32-bit app") I thought 32-bit versions of math programs, in general, were at worst 50% the speed of the 64-bit versions?
That's the way it is. I don't think there's really a hard and fast rule. Maybe that only applies to 32-bit apps with SSE2?

2009-08-12, 00:41   #230
axn

Jun 2003

3·5·17·19 Posts

Quote:
 Originally Posted by axn Yay I am testing this one as soon as I get back from work.
Something's wrong :surprised I'm getting 7 times the speed from the 0.2.3 client!!! (190k to 1.3M)

2009-08-12, 01:02   #231
Mini-Geek
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Quote:
 Originally Posted by axn Something's wrong :surprised I'm getting 7 times the speed from the 0.2.3 client!!! (190k to 1.3M)
I'm getting an extreme speedup too: 2.481 MPsec up from something like 300 KPsec (this is both cores of my Athlon), which is around 8 times faster! However, this isn't quite comparable, since I was sieving 10G-15G, and am now sieving 55G-80G. (and if it matters, n=485k-490k now instead of 480k-485k) Anyone know what sort of speed up I should expect based on that alone and how much is based on the SSE2 speed up? I'm certainly not complaining, just wanted to check that nothing's going wrong.

 Similar Threads Thread Thread Starter Forum Replies Last Post Lennart Conjectures 'R Us 31 2014-09-14 15:14 philmoore Five or Bust - The Dual Sierpinski Problem 66 2010-02-10 14:34 ltd Prime Sierpinski Project 76 2008-07-25 11:44 ltd Prime Sierpinski Project 26 2005-11-01 07:45 R.D. Silverman Factoring 7 2005-09-30 12:57

All times are UTC. The time now is 22:55.

Sat Jan 23 22:55:23 UTC 2021 up 51 days, 19:06, 1 user, load averages: 1.49, 1.60, 1.64