mersenneforum.org  

2007-04-20, 02:55   #12
geoff
Mar 2003
New Zealand
13·89 Posts

Quote:
Originally Posted by em99010pepe
2º have the ability to queue up jobs
3º save its progress on an output file

I have added the workfile and checkpoint features in version 1.0.7, and have also added default names for the input and factors files, so if you start it without specifying the -p, -P, -i or -f flags you now get this behaviour:

Input sieve read from `sieve.txt'
Factors written to `factors.txt'
Checkpoints written to `checkpoint.txt'
Ranges read from `work.txt'

work.txt uses the same format as sr5sieve: one range per line, given in billions, e.g.:
1500,1600
2000,2200
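
A minimal sketch of how a work.txt in this format could be read, assuming each line is exactly two comma-separated integers in billions. The function name `parse_work_lines` is hypothetical, and whether gcwsieve itself accepts blank lines or fractional values is not stated in this thread:

```python
def parse_work_lines(lines):
    """Turn 'start,end' lines (in billions) into (p_start, p_end) integer pairs."""
    ranges = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are simply skipped
        start, end = (int(field) for field in line.split(","))
        if start >= end:
            raise ValueError(f"empty or reversed range: {line!r}")
        # One unit in work.txt is one billion (1e9) in p.
        ranges.append((start * 10**9, end * 10**9))
    return ranges


work = parse_work_lines(["1500,1600", "2000,2200"])
for p_start, p_end in work:
    print(f"range from p={p_start} to p={p_end}")
```

So the line 1500,1600 stands for the range p=1500e9 to p=1600e9.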

To run multiple gcwsieve processes you will need either to create a separate directory for each process, or to specify the file names using the command line switches.
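
The separate-directory approach can be sketched as below. This is only an illustration: the directory names `proc1`, `proc2` and the helper `make_work_dirs` are hypothetical, and the demo uses a placeholder sieve file in a temporary directory. Each directory gets its own copy of sieve.txt and a one-line work.txt, so that a gcwsieve started there without file flags would use the default file names listed above.

```python
import shutil
import tempfile
from pathlib import Path


def make_work_dirs(base, sieve_file, ranges):
    """Create one directory per process, each with its own sieve.txt and work.txt."""
    dirs = []
    for index, (start, end) in enumerate(ranges, start=1):
        proc_dir = Path(base) / f"proc{index}"  # hypothetical directory name
        proc_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy(sieve_file, proc_dir / "sieve.txt")
        # One range per process, in billions, as in the work.txt format above.
        (proc_dir / "work.txt").write_text(f"{start},{end}\n")
        dirs.append(proc_dir)
    return dirs


# Demo with a placeholder sieve file; a real run would copy the actual sieve file.
base = Path(tempfile.mkdtemp())
sieve = base / "sieve.txt"
sieve.write_text("placeholder sieve contents\n")
dirs = make_work_dirs(base, sieve, [(1500, 1600), (2000, 2200)])
for d in dirs:
    print(d, sorted(p.name for p in d.iterdir()))
```

Because factors.txt and checkpoint.txt are also written under default names, keeping one process per directory stops the processes from overwriting each other's output.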

Carlos: I have made the SSE2 code a little (4%) faster on my Northwood P4. Can you check whether it is still slower than version 1.0.4 on your AMD64? If it is still slower then I will add the 1.0.4 code to the AMD build in the next version.

2007-04-20, 06:23   #13
em99010pepe
Sep 2004
B0E₁₆ Posts

Geoff,

A quick test showed it is still slower (5-8 kp/s less) than version 1.0.4.

Carlos

2007-04-20, 08:28   #14
em99010pepe
Sep 2004
2830₁₀ Posts

Geoff,

Just tested the latest version on my P4 3.0 GHz HT and got a decrease in sieve speed from 222 kp/s to 190 kp/s.

Carlos

2007-04-21, 03:14   #15
geoff
Mar 2003
New Zealand
485₁₆ Posts

Quote:
Originally Posted by em99010pepe
Just tested the latest version on my P4 3.0 GHz HT and got a decrease in sieve speed from 222 kp/s to 190 kp/s.

Thanks. For comparison, at p=1500e9 with the 4352 term sieve file, on my 2.9 GHz HT P4 I get 290 kp/s, or 505 kp/s by running two hyperthreads.

I'm not really sure what is going on with the speeds here. Some of the code is the same as that used in sr5sieve, but the main loop is different, so there are more possible complications. It may be that by trying to finely optimise for my own machines, I am making code that will not run fast on any others.

2007-04-21, 08:24   #16
em99010pepe
Sep 2004
2·5·283 Posts

I always had this problem even when I was running sr5sieve. One of the reasons I stopped helping was because the sieve speed was decreasing each time a new version was released.

The previous sieve speeds were measured at p=1800e9.

For my AMD 64 3000+, the optimal cache settings seem to be L1=16 KB and L2=256 KB. I also don't know what's going on... the only way I can help is to test your clients on different machines.

Carlos

2007-04-21, 22:55   #17
geoff
Mar 2003
New Zealand
13·89 Posts

I have made versions 1.0.4a and 1.0.6a using the SSE2 routines from 1.0.4 and 1.0.6, but otherwise they are the same as version 1.0.7.

There is one other setting that you can use to try to speed up the sieve: The -d switch allows you to set the maximum gap between exponents manually. Extra dummy terms will be added to fill any larger gaps. You can see the gap size chosen by running with the -v switch, and try something a bit smaller or larger. The optimal gap size will change with each new sieve file unfortunately.
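
To make the gap/dummy-term trade-off concrete, here is a rough sketch. It is not gcwsieve's actual algorithm: the function `fill_gaps` and the rule for where dummies are placed are assumptions for illustration only. The idea shown is just that a smaller maximum gap forces more dummy terms into the list of exponents.

```python
def fill_gaps(exponents, max_gap):
    """Return (n, is_dummy) pairs so that no adjacent gap exceeds max_gap."""
    if max_gap < 1:
        raise ValueError("max_gap must be at least 1")
    filled = []
    for n in sorted(exponents):
        # Assumed placement rule: step forward by max_gap until n is reachable.
        while filled and n - filled[-1][0] > max_gap:
            filled.append((filled[-1][0] + max_gap, True))
        filled.append((n, False))
    return filled


def count_dummies(exponents, max_gap):
    return sum(1 for _, is_dummy in fill_gaps(exponents, max_gap) if is_dummy)


terms = [10, 14, 30]  # made-up exponents, not taken from a real sieve file
for gap in (4, 8, 16):
    print(f"max gap {gap}: {count_dummies(terms, gap)} dummy term(s)")
```

A smaller gap means more dummy terms to process and a larger gap means fewer, which is presumably why the best value depends on how the exponents are spaced in a given sieve file.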

Carlos: If you happen to remember which sr5sieve version was fastest for your machines I can make it available for download again. But as you can see here with gcwsieve, between three machines -- two of them Pentium 4's -- we have three different versions already :-).

2007-04-22, 09:54   #18
em99010pepe
Sep 2004
2×5×283 Posts

Geoff,

I really can't remember which sr5sieve version was fastest on my machines. Last November I moved all machines to another project...

About gcwsieve, I think the problem here is a memory one.
Version 1.0.4 with L1=16 KB and L2=256 KB is about 6 kp/s faster than version 1.0.4a with the same cache settings. I noticed that the latter detects the cache sizes, whereas 1.0.4 uses L1=16 KB and L2=256 KB by default.

Carlos

Last fiddled with by em99010pepe on 2007-04-22 at 09:56

2007-04-23, 20:44   #19
em99010pepe
Sep 2004
2×5×283 Posts

Geoff,

I don't know if this matters, but I tried sr1sieve because of this sieve project and got my fastest times with an L1 cache size of 32 KB and an L2 cache size of 512 KB, the real ones for my machine.

Carlos

2007-06-10, 22:18   #20
geoff
Mar 2003
New Zealand
13×89 Posts
gcwsieve 1.0.8

Compared to the previous version, this one is about 10% faster on my P4 and about 5% faster on my P3. There is no need to upgrade if it turns out to be slower on your machine.

On Windows the p/sec and sec/factor rates are now measured in CPU-seconds for consistency with the Unix versions (the same as recent versions of sr5sieve). To get a valid comparison with previous versions, run them on an otherwise idle CPU.
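
The difference between a rate per CPU-second and a rate per elapsed second can be seen with a small sketch. This is not gcwsieve code; the function `measure` is hypothetical, and the `time.sleep` call merely stands in for time during which the process is not running because other programs have the CPU.

```python
import time


def measure(work, sleep_seconds=0.0):
    """Do some CPU-bound work, then idle; return (result, cpu_seconds, wall_seconds)."""
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # CPU time used by this process only
    total = sum(i * i for i in range(work))
    time.sleep(sleep_seconds)          # stand-in for time lost to other processes
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return total, cpu, wall


work_units = 1_000_000
_, cpu, wall = measure(work_units, sleep_seconds=0.2)
print(f"CPU seconds:  {cpu:.3f}")
print(f"wall seconds: {wall:.3f}")
# Guard against a zero reading on clocks with coarse resolution.
print(f"rate per CPU-second:  {work_units / max(cpu, 1e-9):.0f}")
print(f"rate per wall-second: {work_units / max(wall, 1e-9):.0f}")
```

On a busy machine the wall-clock rate drops while the CPU-second rate does not, so a version that reports elapsed-time rates only gives a comparable figure when nothing else is competing for the CPU.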

2007-06-18, 22:40   #21
geoff
Mar 2003
New Zealand
485₁₆ Posts
gcwsieve 1.0.9

Some changes to the 32-bit SSE2 assembler (taken from sr5sieve 1.5.6) have had a big effect on P4 performance. Here are some times for my 2.9 GHz P4 at p=4200e9 using the 2947 term sieve file for the 2.5-5.0 million n range:
Code:
Version  Single Thread      2 Hyperthreads
-------  -------------      --------------
1.0.7:   405 kp/s           700 kp/s
1.0.8:   446 kp/s (+10%)    720 kp/s (+3%)
1.0.9:   644 kp/s (+44%)    786 kp/s (+9%)
I don't know whether other SSE2 machines will benefit by as much, or at all.
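
As a check on the arithmetic, the percentages in the table are each version's gain over the version before it. The sketch below recomputes them from the kp/s figures; the helper name `percent_gain` is just for illustration.

```python
# kp/s figures from the table: (single thread, two hyperthreads)
rates = {
    "1.0.7": (405, 700),
    "1.0.8": (446, 720),
    "1.0.9": (644, 786),
}


def percent_gain(new, old):
    """Relative speed-up of new over old, rounded to a whole percent."""
    return round(100 * (new - old) / old)


versions = list(rates)
for prev, cur in zip(versions, versions[1:]):
    single = percent_gain(rates[cur][0], rates[prev][0])
    dual = percent_gain(rates[cur][1], rates[prev][1])
    print(f"{prev} -> {cur}: single +{single}%, two hyperthreads +{dual}%")

# Cumulative gain from 1.0.7 to 1.0.9
total_single = percent_gain(rates["1.0.9"][0], rates["1.0.7"][0])
total_dual = percent_gain(rates["1.0.9"][1], rates["1.0.7"][1])
print(f"1.0.7 -> 1.0.9: single +{total_single}%, two hyperthreads +{total_dual}%")
```

Taken together, 1.0.9 comes out about 59% faster than 1.0.7 for a single thread and about 12% faster with two hyperthreads on this machine.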

My P4 is still better off LLR testing, but my P4/Celeron would now be more productive sieving, even just on the 2.5-5 million range. (It is the same speed as a full P4 for sieving, but only half speed for LLR testing which it is doing at the moment).

2007-06-19, 09:43   #22
hhh
Jun 2005
175₁₆ Posts

Great news indeed.

Sieving the 5-25M range will be done up to 100G in a couple of days, and I think we can start a new sieve drive then. Is that OK for you, or do you suggest reopening the late 1.5-5M range?

Yours H.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.