mersenneforum.org  

Old 2003-10-08, 14:40   #1
Joe O
 
Aug 2002

1015₈ Posts
Optimizing Athlon/Pentium3 Code

Prime95 said:
Quote:
Actually, I'm pretty sure I can get another 10% for Athlons and Pentium 3 computers by employing one of the tricks I learned in the P4 optimizations. Unfortunately, it requires a major rewrite of the x86 FFT code. This is not something I relish doing!

Sadly, I also know of a way to improve the Proth prime search (Seventeen or Bust) by a substantial amount, but again it requires a major coding effort.

I was wondering if you had started on either of these. Do you need any help?
Old 2003-10-08, 22:38   #2
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2⁴·17·29 Posts

I started and stopped the effort. My P3 testbed (actually Celeron) died and I've also been kind of busy.

If you look around you'll see where I asked for some P3 benchmarks on two FFT sizes. The results were mixed - less than I had hoped for. The Athlon improvement was more than 10% if I recall correctly.

I've not started the Proth speedup, but the two FFT sizes I did change were coded in such a way that the Proth speedup can be implemented too.

What would be required to complete the effort? All x87 FFT sizes would need to be recoded too. The auxiliary add and subtract routines need to be rewritten and tested. The Proth mod routine needs to be rewritten, etc. Even the P4 Proth mod routine needs rewriting.

The downside is the new version seems to be slower for P2 and older CPUs. Maintaining both code paths is not reasonable, so these slower CPUs would be stuck with version 23 or suffer with a slower newer version.
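
For readers unfamiliar with what an "FFT size" refers to in this post, here is a minimal textbook sketch of FFT-based integer multiplication in Python. It is not Prime95's method: the real code is hand-tuned assembly with a much more specialized transform. The names `fft`, `fft_multiply`, and the choice of base 256 digits are assumptions made purely for illustration.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)  # twiddle factor
        t = w * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_multiply(x, y, base=1 << 8):
    """Multiply non-negative ints by convolving their base-`base` digits."""
    def digits(v):
        d = []
        while v:
            v, r = divmod(v, base)
            d.append(r)
        return d or [0]

    dx, dy = digits(x), digits(y)
    size = 1
    while size < len(dx) + len(dy):
        size *= 2  # transform length: the "FFT size" in this toy model
    fx = fft(dx + [0] * (size - len(dx)))
    fy = fft(dy + [0] * (size - len(dy)))
    # Pointwise product in the transform domain equals digit convolution.
    prod = fft([p * q for p, q in zip(fx, fy)], invert=True)
    result = 0
    for i, c in enumerate(prod):
        coeff = round(c.real / size)  # undo FFT scaling, round off float error
        result += coeff * base ** i   # adding with place value handles carries
    return result
```

Larger numbers need a longer transform, which is why a program of this kind carries separately coded routines for each FFT size, and why recoding "all FFT sizes" is a large job.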
Old 2003-10-09, 02:28   #3
outlnder
 
Aug 2002

2·3·53 Posts

What about an AMD client and an Intel client??
Old 2003-10-10, 01:45   #4
PageFault
 
Aug 2002
Dawn of the Dead

5×47 Posts

Aren't PII's by default put on trial factoring anyways, i.e., doublecheck cutoff at 500 MHz minimum? Besides, the enthusiast who would still run archaic hardware would likely know which client to use anyways ... I doubt these machines contribute much to LL testing in the first place ... I don't want to think of my 350 crunching a 20000000M exponent ...

Athlons however do have a significant impact on production ... despite our mass adoption of Northwood technology, TPR still has hundreds of AMD machines ... they need the optimization ...

Quote:
Originally posted by Prime95

The downside is the new version seems to be slower for P2 and older CPUs. Maintaining both code paths is not reasonable, so these slower CPUs would be stuck with version 23 or suffer with a slower newer version.

Last fiddled with by PageFault on 2003-10-10 at 01:48
Old 2003-10-12, 14:50   #5
Joe O
 
Aug 2002

3·5²·7 Posts

Quote:
Originally posted by Prime95
...
What would be required to complete the effort? All x87 FFT sizes would need to be recoded too.
...
The downside is the new version seems to be slower for P2 and older CPUs. Maintaining both code paths is not reasonable, so these slower CPUs would be stuck with version 23 or suffer with a slower newer version.

How about starting with the larger FFT sizes? Especially the new SSE2 one for 77M. I don't know how many P2s are in use on LL/DC, but presumably they are working on the smaller FFT sizes. If not, then they will have to stay with the current client. This would give the project the boost in the larger FFT ranges. 10%+ is nothing to sneeze at!
Old 2003-10-14, 04:13   #6
cheesehead
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

2²·3·641 Posts

Quote:
Originally posted by PageFault
Aren't PII's by default put on trial factoring anyways, i.e., doublecheck cutoff at 500 MHz minimum? Besides, the enthusiast who would still run archaic hardware would likely know which client to use anyways ... I doubt these machines contribute much to LL testing in the first place

But trial is not the only method of factoring, and LL testing is not the only application of FFT multiplication.

P-1 factoring is included in PrimeNet factoring assignments, and it uses FFT multiplication.

So the effect of an FFT change on pre-P3 models does need to be considered.
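
To illustrate why P-1 factoring leans on fast multiplication, here is a rough sketch of P-1 stage 1 for a Mersenne number 2^p − 1. It is a textbook version, not Prime95's implementation; the names `small_primes`, `p_minus_1_stage1`, `b1`, and `base` are assumptions for illustration. Python's built-in `pow` stands in for the modular multiplications that a real client would perform with FFTs.

```python
from math import gcd

def small_primes(limit):
    """Sieve of Eratosthenes: all primes <= limit (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Zero out every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def p_minus_1_stage1(p, b1, base=3):
    """Stage 1 of P-1 on 2**p - 1; returns a nontrivial factor or None."""
    n = (1 << p) - 1
    # Any factor of 2^p - 1 (p an odd prime) has the form 2*k*p + 1,
    # so 2*p is folded into the exponent for free.
    x = pow(base, 2 * p, n)
    for q in small_primes(b1):
        qk = q
        while qk * q <= b1:  # largest power of q not exceeding b1
            qk *= q
        x = pow(x, qk, n)    # each call is a chain of big modular multiplies
    g = gcd(x - 1, n)
    return g if 1 < g < n else None
```

For example, `p_minus_1_stage1(11, 2)` finds the factor 23 of 2^11 − 1 = 2047, because 23 − 1 = 2·11 divides the accumulated exponent. Every step is a multiplication modulo 2^p − 1, so for exponents of practical size the speed of the underlying multiplication routine dominates the run time.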

Last fiddled with by cheesehead on 2003-10-14 at 04:17
Old 2003-10-15, 04:23   #7
cheesehead
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

2²×3×641 Posts

Quote:
Originally posted by cheesehead
P-1 factoring is included in PrimeNet factoring assignments

... or, rather, I thought that a PrimeNet factoring assignment was for both trial and P-1 factoring of the assigned exponent. But I find that the worktodo line is "Factor=...", which is only trial factoring.

Sorry, PageFault ... I mis-remembered, and I should've checked before posting.