mersenneforum.org  

2010-09-29, 21:06   #23
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

14151₈ Posts

I've also tried comparing iteration timings with Prime95 v26.2 vs. 25.11 (both Windows 32-bit) to verify whether the problem is in gwnum, or just PFGW.

In both cases, I used the Advanced>Time option, and ran 1000 iterations of M38000000.

25.11: ~60 ms/iter.
26.2: ~50 ms/iter.

So there's a significant speedup going to version 26.2. Of course, this is a much bigger FFT than the one used for the base 5 numbers, so I also tried 1000 iterations of M1100000.

25.11: ~1.2 ms/iter.
26.2: ~1.5 ms/iter.

It seems that version 26.2 is actually slower on this FFT.

Note that Prime95 v26.2 used the Pentium 4 type-3 56K FFT for this number, whereas the base 5 numbers tested earlier were done with a Core2 type-3 128K FFT. However, the two cases do have something in common: in both, the v26 gwnum program tested slower on my CPU at these small FFT sizes.
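
To put rough numbers on the two comparisons above, here is a minimal Python sketch. The helper name `percent_change` is purely illustrative, and the inputs are the approximate ms/iter figures from this post, so the results are only as good as those estimates.

```python
def percent_change(old_ms: float, new_ms: float) -> float:
    """Return the change in time per iteration from old to new, in percent.

    A negative result means the new version is faster; positive means slower.
    """
    return (new_ms - old_ms) / old_ms * 100.0


# Approximate timings from this post: (v25.11 ms/iter, v26.2 ms/iter).
timings = {
    "M38000000": (60.0, 50.0),
    "M1100000": (1.2, 1.5),
}

for name, (old_ms, new_ms) in timings.items():
    print(f"{name}: {percent_change(old_ms, new_ms):+.1f}% time per iteration")
```

In other words, 26.2 needs about 17% less time per iteration on M38000000 but about 25% more on M1100000, taking the rough figures at face value.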
2010-09-29, 21:47   #24
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

11·673 Posts

Quote:
Originally Posted by mdettweiler
I've also tried comparing iteration timings with Prime95 v26.2 vs. 25.11 (both Windows 32-bit) to verify whether the problem is in gwnum, or just PFGW.

You can add "PRP=289184,5,477336,-1" to worktodo.txt to time the exact numbers in question.

It is baffling to me why these smaller FFTs are slower on your Core 2 but not for anyone else. Maybe CPU-Z or one of the other programs that do a more thorough dump of CPU characteristics might shed some light.
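
For anyone puzzling over that worktodo.txt line: the four fields appear to be k, b, n, c for a number of the form k*b^n+c, which matches the 289184*5^477336-1 being discussed in this thread. A small Python sketch under that assumption follows; the function names are made up for illustration, and it only handles an entry with exactly these four fields.

```python
def parse_prp_entry(line: str) -> tuple[int, int, int, int]:
    """Split a 'PRP=k,b,n,c' line into the integers (k, b, n, c)."""
    keyword, _, fields = line.strip().partition("=")
    if keyword != "PRP":
        raise ValueError(f"not a PRP entry: {line!r}")
    # Assumes exactly four comma-separated integer fields.
    k, b, n, c = (int(field) for field in fields.split(","))
    return k, b, n, c


def describe(k: int, b: int, n: int, c: int) -> str:
    """Format k*b^n+c the way the number is written in this thread."""
    return f"{k}*{b}^{n}{c:+d}"


k, b, n, c = parse_prp_entry("PRP=289184,5,477336,-1")
print(describe(k, b, n, c))  # 289184*5^477336-1

# Exact size of the number in bits, using Python's arbitrary-precision integers.
print((k * b**n + c).bit_length())
```

The bit length comes out at roughly 1.1 million, so the number is similar in size to M1100000.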
2010-09-30, 02:26   #25
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

6249₁₀ Posts

Quote:
Originally Posted by Prime95
You can add "PRP=289184,5,477336,-1" to worktodo.txt to time the exact numbers in question.

It is baffling to me why these smaller FFTs are slower on your Core 2 but not for anyone else. Maybe CPU-Z or one of the other programs that do a more thorough dump of CPU characteristics might shed some light.

Okay, here are the iteration timings I got for that with Prime95 v25.11 and 26.2:

25.11: ~3.55 ms/iter.
26.2: ~3.45 ms/iter.

Wouldn't you know it, I get a speed boost with 26.2 after all. It would seem, then, that this is an issue in PFGW and not in gwnum.
2010-09-30, 02:41   #26
rogue

"Mark"
Apr 2003
Between here and the

2²·3·523 Posts

Quote:
Originally Posted by mdettweiler
Okay, here are the iteration timings I got for that with Prime95 v25.11 and 26.2:

25.11: ~3.55 ms/iter.
26.2: ~3.45 ms/iter.

Wouldn't you know it, I get a speed boost with 26.2 after all. It would seem, then, that this is an issue in PFGW and not in gwnum.

Not necessarily. I've already shown the timings on Windows for the same build. Such a vast proportion of time is spent in gwnum that it is unlikely that PFGW could cause such a significant slowdown. There is clearly something curious going on here, though.

Last fiddled with by rogue on 2010-09-30 at 02:42
2010-09-30, 06:10   #27
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
Originally Posted by rogue
Not necessarily. I've already shown the timings on Windows for the same build. Such a vast proportion of time is spent in gwnum that it is unlikely that PFGW could cause such a significant slowdown. There is clearly something curious going on here, though.

Here's what I get comparing iteration times on 289184*5^477336-1 for LLR 3.8.1 and 3.8.2:

3.8.1: ~3.45 ms/iter.
3.8.2: ~3.40 ms/iter.

Just like Prime95, LLR gets a speed increase with 3.8.2 as expected.

What I really should do, though, is run the test from start to finish on each version of both Prime95 and LLR. We've already seen that in such a test PFGW 3.3.6 is inexplicably faster than 3.4.0, but it would be interesting to see if the same holds true for Prime95 and LLR. Sure, the ms/iter. figures show the newer version to be faster in both cases, but the figures fluctuated rather wildly each time and I had to come up with a "gut estimate average" to post here. The potential for experimental error is, needless to say, rather large.

George, quick question: is there a way to make Prime95 print the exact wall-clock runtime at the end of a test, like PFGW and LLR do? As it is now, there's not really an easy way to directly measure this with Prime95.
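
As for the fluctuating ms/iter. figures, one way to replace the "gut estimate average" would be to jot down a handful of readings and summarize them. A minimal Python sketch; the readings below are hypothetical placeholders, not values I actually measured, and `summarize` is just an illustrative helper name.

```python
import statistics


def summarize(samples_ms):
    """Return (mean, median, sample standard deviation) of ms/iter readings."""
    return (
        statistics.mean(samples_ms),
        statistics.median(samples_ms),
        statistics.stdev(samples_ms),
    )


# Hypothetical readings for illustration only; these are NOT measured values.
samples_ms = [3.41, 3.52, 3.38, 3.61, 3.44, 3.47, 3.39, 3.58]

mean, median, spread = summarize(samples_ms)
print(f"mean   = {mean:.3f} ms/iter")
print(f"median = {median:.3f} ms/iter")
print(f"stdev  = {spread:.3f} ms/iter")
```

If the spread turns out to be comparable to the difference between two versions, the per-iteration figures alone can't settle which one is faster, which is why a full start-to-finish run is the better test.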
2010-09-30, 12:39   #28
rogue

"Mark"
Apr 2003
Between here and the

2²·3·523 Posts

Quote:
Originally Posted by mdettweiler
Here's what I get comparing iteration times on 289184*5^477336-1 for LLR 3.8.1 and 3.8.2:

3.8.1: ~3.45 ms/iter.
3.8.2: ~3.40 ms/iter.

Just like Prime95, LLR gets a speed increase with 3.8.2 as expected.

What I really should do, though, is run the test from start to finish on each version of both Prime95 and LLR. We've already seen that in such a test PFGW 3.3.6 is inexplicably faster than 3.4.0, but it would be interesting to see if the same holds true for Prime95 and LLR. Sure, the ms/iter. figures show the newer version to be faster in both cases, but the figures fluctuated rather wildly each time and I had to come up with a "gut estimate average" to post here. The potential for experimental error is, needless to say, rather large.

George, quick question: is there a way to make Prime95 print the exact wall-clock runtime at the end of a test, like PFGW and LLR do? As it is now, there's not really an easy way to directly measure this with Prime95.

I'm curious. Is 3.4.0 slower for other n and other bases?
2010-09-30, 14:28   #29
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

11×673 Posts

Quote:
Originally Posted by mdettweiler
George, quick question: is there a way to make Prime95 print the exact wall-clock runtime at the end of a test, like PFGW and LLR do? As it is now, there's not really an easy way to directly measure this with Prime95.

The date/time is displayed at the start of every line output to the screen.
2010-09-30, 14:58   #30
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
Originally Posted by rogue
I'm curious. Is 3.4.0 slower for other n and other bases?

For base 2, 3.4.0 is faster:
Code:
PFGW Version 3.3.6.20100908.Win_Stable [GWNUM 25.14]
2071*2^270307-1 is composite: RES64: [816B1DBFBFC67D09] (123.9523s+0.0004s)
 
 
PFGW Version 3.4.0.32BIT.20100925.Win_Dev [GWNUM 26.2]
2071*2^270307-1 is composite: RES64: [816B1DBFBFC67D09] (108.1168s+0.0003s)
Ditto for base 3:
Code:
PFGW Version 3.3.6.20100908.Win_Stable [GWNUM 25.14]
170979002*3^50000+1 is composite: RES64: [CBA9FAA11257431A] (15.1078s+0.0014s)
 
PFGW Version 3.4.0.32BIT.20100925.Win_Dev [GWNUM 26.2]
170979002*3^50000+1 is composite: RES64: [CBA9FAA11257431A] (12.1903s+0.0017s)
And for another n on base 5, 3.4.0 is again faster:
Code:
PFGW Version 3.3.6.20100908.Win_Stable [GWNUM 25.14]
18656*5^65474-1 is composite: RES64: [BB2682E39AA9CB16] (42.5636s+0.0034s)
 
PFGW Version 3.4.0.32BIT.20100925.Win_Dev [GWNUM 26.2]
18656*5^65474-1 is composite: RES64: [BB2682E39AA9CB16] (37.9236s+0.0038s)
The problem, it would seem, is localized to this particular range of n on base 5 (possibly this particular FFT size).
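
To quantify those three comparisons, here is a minimal Python sketch that pulls the first timing figure out of each PFGW result line above and works out how much less time 3.4.0 took. The regular expression and helper names are my own for illustration; the small second figure after the plus sign is ignored, since what it measures isn't stated here.

```python
import re

# Captures the first figure inside a parenthesized timing such as "(123.9523s+0.0004s)".
TIME_RE = re.compile(r"\((\d+(?:\.\d+)?)s\+")


def elapsed_seconds(result_line: str) -> float:
    """Extract the main elapsed time, in seconds, from a PFGW result line."""
    match = TIME_RE.search(result_line)
    if match is None:
        raise ValueError(f"no timing found in: {result_line!r}")
    return float(match.group(1))


def percent_saved(old_line: str, new_line: str) -> float:
    """Percentage of the old run time that the new run saves."""
    old, new = elapsed_seconds(old_line), elapsed_seconds(new_line)
    return (old - new) / old * 100.0


# (3.3.6 result line, 3.4.0 result line), copied from the runs above.
runs = [
    ("2071*2^270307-1 is composite: RES64: [816B1DBFBFC67D09] (123.9523s+0.0004s)",
     "2071*2^270307-1 is composite: RES64: [816B1DBFBFC67D09] (108.1168s+0.0003s)"),
    ("170979002*3^50000+1 is composite: RES64: [CBA9FAA11257431A] (15.1078s+0.0014s)",
     "170979002*3^50000+1 is composite: RES64: [CBA9FAA11257431A] (12.1903s+0.0017s)"),
    ("18656*5^65474-1 is composite: RES64: [BB2682E39AA9CB16] (42.5636s+0.0034s)",
     "18656*5^65474-1 is composite: RES64: [BB2682E39AA9CB16] (37.9236s+0.0038s)"),
]

for old_line, new_line in runs:
    name = old_line.split(" is ")[0]
    print(f"{name}: 3.4.0 takes {percent_saved(old_line, new_line):.1f}% less time")
```

That works out to roughly 13%, 19% and 11% less time for 3.4.0 on these three candidates.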

Mark, did you by chance check which FFT 3.4.0 chose for the two (larger) base 5 tests on your CPU? Mine used "Core2 type-3 FFT length 128K"; maybe yours chose a different CPU architecture? (Grasping at straws here...)
Quote:
Originally Posted by Prime95
The date/time is displayed at the start of every line output to the screen.

But that only prints out at the end of each test; what I'd need is for it to print the time at the beginning as well, so I could subtract and get the runtime.

I suppose what I could do is stick a minuscule test (n=100 or so) in the worktodo.txt file right before the base 5 test. That way, it prints out the time at the tiny test's completion (i.e., at the start of the base 5 test) and again at the end of the base 5 test. I'll try that later today.
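
The subtraction itself is easy enough to script once the two timestamps are in hand. A rough Python sketch; the timestamp layout and the two sample values are assumptions made up for illustration, not Prime95's actual output format, so `STAMP_FORMAT` would need adjusting to whatever the program really prints.

```python
from datetime import datetime

# Assumed timestamp layout for illustration only; adjust to the real output.
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def runtime_seconds(start_stamp: str, end_stamp: str) -> float:
    """Return the wall-clock time between two timestamps, in seconds."""
    start = datetime.strptime(start_stamp, STAMP_FORMAT)
    end = datetime.strptime(end_stamp, STAMP_FORMAT)
    return (end - start).total_seconds()


# Hypothetical timestamps: the end of the tiny test and the end of the base 5 test.
print(runtime_seconds("2010-09-30 15:00:05", "2010-09-30 15:27:41"))  # 1656.0
```

The result can only be as precise as the timestamps themselves, so if the program prints times to the minute, the runtime is good to about a minute either way.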
2010-09-30, 15:15   #31
rogue

"Mark"
Apr 2003
Between here and the

2²×3×523 Posts

This is what 3.4.0 chose on Win64:

Special modular reduction using zero-padded Core2 type-3 FFT length 128K, Pass1=128, Pass2=1K on 289184*5^477336-1
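
For comparing these lines between machines, a quick Python sketch that pulls out the FFT length and the two pass sizes. Reading "K" as 1024 and expecting Pass1 times Pass2 to equal the FFT length are my assumptions about how to interpret the line; the figures here are at least consistent with that reading. The helper name `parse_size` is illustrative.

```python
import re


def parse_size(text: str) -> int:
    """Convert a size such as '128', '1K' or '128K' to an integer (K = 1024)."""
    if text.upper().endswith("K"):
        return int(text[:-1]) * 1024
    return int(text)


line = ("Special modular reduction using zero-padded Core2 type-3 FFT length 128K, "
        "Pass1=128, Pass2=1K on 289184*5^477336-1")

match = re.search(r"FFT length (\w+), Pass1=(\w+), Pass2=(\w+)", line)
fft_length, pass1, pass2 = (parse_size(group) for group in match.groups())

# Prints the FFT length, the product of the pass sizes, and whether they agree.
print(fft_length, pass1 * pass2, fft_length == pass1 * pass2)  # 131072 131072 True
```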
2010-09-30, 15:27   #32
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

1869₁₆ Posts

Quote:
Originally Posted by rogue
This is what 3.4.0 chose on Win64:

Special modular reduction using zero-padded Core2 type-3 FFT length 128K, Pass1=128, Pass2=1K on 289184*5^477336-1

That's the same as what I got:

Special modular reduction using zero-padded Core2 type-3 FFT length 128K, Pass1=128, Pass2=1K on 289184*5^477336-1

Does the 32-bit version by chance give you something different?
2010-09-30, 15:45   #33
rogue

"Mark"
Apr 2003
Between here and the

2²·3·523 Posts

Quote:
Originally Posted by mdettweiler
That's the same as what I got:

Special modular reduction using zero-padded Core2 type-3 FFT length 128K, Pass1=128, Pass2=1K on 289184*5^477336-1

Does the 32-bit version by chance give you something different?

Nope.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.